
I'm creating a class that renames a file using a user-specified format. This format will be a simple string whose str.format method will be called to fill in the blanks.

It turns out that my procedure requires extracting the variable names contained in braces. For example, a string may contain {user}, which should yield user. A single string may contain several sets of braces, and I need the contents of each collected in a list, in the order in which they appear.

Thus, "{foo}{bar}" should yield ['foo', 'bar'].

I suspect that the easiest way to do this is to use re.split, but I know nothing about regular expressions. Can someone help me out?

Thanks in advance!

Louis Thibault
  • In case you know all possible variables *beforehand*, you can just pass them all to `str.format`; it will ignore those not in the pattern. `'{user}_{bar}'.format(user='Mike', foo=1, bar=2)` will output `Mike_2`. I happened to have the allowed vars fixed in a dict, so I could skip looking for vars in the pattern. Anyway, knowing about `string.Formatter()` is useful. – yentsun Mar 11 '13 at 10:10
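The approach from the comment above can be sketched as follows. The `allowed` dict and its values are hypothetical, chosen only to mirror the comment's example; this is an illustration, not the asker's actual code.

```python
# Hypothetical set of allowed variables; the names and values are illustrative only.
allowed = {"user": "Mike", "foo": 1, "bar": 2}

pattern = "{user}_{bar}"

# str.format ignores keyword arguments that the pattern does not reference,
# so the whole dict can be unpacked without parsing the pattern first.
result = pattern.format(**allowed)
print(result)  # Mike_2
```

Note that the reverse case is not forgiven: a pattern that references a name missing from the dict raises `KeyError`.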

2 Answers


Another possibility is to use Python's actual Formatter itself to extract the field names for you:

>>> import string
>>> s = "{foo} spam eggs {bar}"
>>> string.Formatter().parse(s)
<formatteriterator object at 0x101d17b98>
>>> list(string.Formatter().parse(s))
[('', 'foo', '', None), (' spam eggs ', 'bar', '', None)]
>>> field_names = [name for text, name, spec, conv in string.Formatter().parse(s)]
>>> field_names
['foo', 'bar']

or (shorter but less informative):

>>> field_names = [v[1] for v in string.Formatter().parse(s)]
>>> field_names
['foo', 'bar']
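
One caveat worth illustrating: when the string ends with literal text, or contains escaped braces, `parse` yields tuples whose field name is `None`. A minimal sketch that filters those out, assuming a helper called `field_names` (a hypothetical name, not part of the standard library):

```python
import string


def field_names(fmt):
    """Return the replacement-field names of fmt, in order of appearance."""
    # Each tuple from parse() is (literal_text, field_name, format_spec, conversion);
    # field_name is None for chunks that contain only literal text.
    return [
        name
        for _, name, _, _ in string.Formatter().parse(fmt)
        if name is not None
    ]


print(field_names("{foo} spam eggs {bar} done"))  # ['foo', 'bar']
```

Auto-numbered fields such as `{}` come back as empty strings, and names like `{foo.bar}` or `{foo[0]}` are returned whole, so further filtering may be needed depending on the formats you accept.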
DSM

Using re.findall():

In [5]: import re

In [8]: strs = "{foo} spam eggs {bar}"

In [9]: re.findall(r"{(\w+)}", strs)
Out[9]: ['foo', 'bar']
Ashwini Chaudhary
  • Just a quick question. Are the results from `re.findall` guaranteed to be listed in the same order as they appear in the string? – Louis Thibault Dec 27 '12 at 21:53
  • @blz Yes, as the string is parsed from left to right. – Ashwini Chaudhary Dec 27 '12 at 21:56
  • Beware, this does not account for format specifiers such as `{spam:3f}`. @DSM's answer should be the accepted one. Modifying the `\w` to include more characters until it matches the full spec of `str.format` could work, but using the formatter itself is better (and not prone to breakage if the syntax evolves) – ewen-lbh Apr 18 '21 at 08:55
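
To make the difference raised in the last comment concrete, here is a small comparison of the two approaches on a string that uses a format spec and a conversion. The sample string is made up for illustration.

```python
import re
import string

# Illustrative input: a plain field, a field with a format spec,
# and a field with a conversion flag.
s = "{foo} spam {bar:3f} eggs {baz!r}"

# The \w+ pattern only matches braces that contain nothing but word characters.
regex_names = re.findall(r"{(\w+)}", s)

# Formatter.parse separates the field name from the format spec and conversion.
parsed_names = [
    name
    for _, name, _, _ in string.Formatter().parse(s)
    if name is not None
]

print(regex_names)   # ['foo']
print(parsed_names)  # ['foo', 'bar', 'baz']
```

The regex silently drops `bar` and `baz`, while the formatter-based version reports all three names.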