3

Is there any way in Python to reverse the formatting operation done through the "%" operator?

formated = "%d ooo%s" % (12, "ps")
#formated is now '12 ooops'
(arg1, arg2) = theFunctionImSeeking("12 ooops", "%d ooo%s")
#arg1 is 12 and arg2 is "ps"

EDIT: Regular expressions could be a solution, but they are harder to write, and I suspect they are slower since they can handle more complex structures. I would really like an equivalent of sscanf.

AsTeR
  • 5
    Yes: use regular expressions. – Marcin Jan 31 '12 at 17:58
  • 1
    possible duplicate of [sscanf in Python](http://stackoverflow.com/questions/2175080/sscanf-in-python) – Michael Mrozek Jan 31 '12 at 17:59
  • @MichaelMrozek thanks, I had forgotten the name of that C function – AsTeR Jan 31 '12 at 18:01
  • Is there a reason you don't want to use regular expressions? It would help if we could see what you want to do. – Nathan Jones Jan 31 '12 at 18:04
  • -1. Looks to me like a duplicate of [*sscanf in Python*](http://stackoverflow.com/questions/2175080/sscanf-in-python). To summarise the answer there: someone's written `sscanf()` in Python, but regular expressions are the better tool in Python. @AsTeR, you "suspect" RE's to be slower; have you pulled out your profiler and measured it? – Jim DeLaHunt Jan 31 '12 at 18:22
  • 1
    @JimDeLaHunt no, I didn't. I would need an sscanf equivalent to do so, wouldn't I? – AsTeR Jan 31 '12 at 18:44
  • I think this question is not the same as http://stackoverflow.com/questions/2175080/sscanf-in-python as it emphasizes the need to have a two way method. Re is only a parsing tool (yet very powerful), and the % approach and string.format (btw) are printing methods. The good thing with sscanf is that it is the reverse of sprintf (i.e. % in python) – Juh_ Sep 14 '12 at 13:06

1 Answer

6

Use regular expressions (re module):

>>> import re
>>> match = re.search(r'(\d+) ooo(\w+)', '12 ooops')
>>> match.group(1), match.group(2)
('12', 'ps')

Regular expressions are as close as you can get to what you want. There is no built-in way to do it using the same format string ('%d ooo%s').
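Note that the groups above come back as strings, while the question asks for arg1 to be the integer 12. As a minimal sketch (the names PATTERN and parse_line are made up for this example, not part of any library), a hand-written pattern for this one format string can convert the number itself:

```python
import re

# Hand-written regex equivalent of the format string "%d ooo%s".
# PATTERN and parse_line are hypothetical names used only for illustration.
PATTERN = re.compile(r'(?P<number>-?\d+) ooo(?P<rest>\S+)')

def parse_line(text):
    """Return (int, str) parsed from text, or raise ValueError."""
    match = PATTERN.fullmatch(text)
    if match is None:
        raise ValueError("String doesn't match pattern: %r" % text)
    # Convert the numeric group explicitly; regex groups are always strings.
    return int(match.group('number')), match.group('rest')

arg1, arg2 = parse_line("12 ooops")
print(arg1, arg2)  # 12 ps
```

This assumes Python 3.4+ for re.fullmatch; on older versions, appending '$' to the pattern and using re.match has a similar effect.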

EDIT: As @Daenyth suggested, you could implement your own function with this behaviour:

import re

def python_scanf(my_str, pattern):
    # Map each supported format specifier to a capturing regex group.
    D = ('%d', r'(\d+?)')
    F = ('%f', r'(\d+\.\d+?)')
    S = ('%s', r'(.+?)')
    # Anchor at the end so the lazy groups must consume the whole string.
    re_pattern = pattern.replace(*D).replace(*F).replace(*S) + '$'
    match = re.match(re_pattern, my_str)
    if match:
        return match.groups()
    raise ValueError("String doesn't match pattern")

Usage:

>>> python_scanf("12 ooops", "%d ooo%s")
('12', 'ps')
>>> python_scanf("12 ooops", "%d uuu%s")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 10, in python_scanf
ValueError: String doesn't match pattern

Of course, python_scanf won't work with more complex patterns like %.4f or %r.
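If you also want typed results and literal text that may contain regex metacharacters, one possible extension is sketched below. The names scan, _SPECS and _SPEC_RE are hypothetical, and the choice to make %s match a run of non-whitespace (as C's sscanf does) is an assumption of this sketch. It still covers only %d, %f and %s, so %.4f, %r and %% remain unsupported.

```python
import re

# Hypothetical lookup table: specifier -> (regex fragment, converter).
_SPECS = {
    '%d': (r'([-+]?\d+)', int),
    '%f': (r'([-+]?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?)', float),
    '%s': (r'(\S+)', str),  # assumption: like sscanf, stop at whitespace
}
_SPEC_RE = re.compile(r'%[dfs]')

def scan(text, fmt):
    """Parse text according to fmt and return a tuple of converted values."""
    parts, converters, pos = [], [], 0
    for m in _SPEC_RE.finditer(fmt):
        # Escape the literal text between specifiers so it matches verbatim.
        parts.append(re.escape(fmt[pos:m.start()]))
        regex, converter = _SPECS[m.group()]
        parts.append(regex)
        converters.append(converter)
        pos = m.end()
    parts.append(re.escape(fmt[pos:]))
    match = re.fullmatch(''.join(parts), text)
    if match is None:
        raise ValueError("String doesn't match pattern")
    return tuple(conv(group) for conv, group in zip(converters, match.groups()))

print(scan("12 ooops", "%d ooo%s"))  # (12, 'ps')
```

Building the regex piece by piece, rather than chaining str.replace calls, is what makes it possible to escape the literal parts of the format string without also escaping the inserted groups.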

juliomalegria