2

Let's say I have a string:

  > my_string = '{foo}/{bar}'
  > my_string.format(foo='foo', bar='bar')
  'foo/bar'

Right, cool. But in my case, I want to retrieve the names of the keyword arguments used in my_string. Here is what I have done:

  > import re
  > ATTRS_PATTERN = re.compile(r'{(?P<variable>[_a-z][_a-z0-9]*)}')
  > ATTRS_PATTERN.findall(my_string)
  ['foo', 'bar']

It's not very elegant. Do you have a better idea?

vpoulain
  • Note that format specifiers can be nested one level: `'{hello:{world}}'.format(hello=5.123456, world='.2f') --> '5.12'`. If you don't keep track of which fields are used inside other fields, you might pass in invalid format specifiers, which would raise an error if you ever format the string. – Bakuriu Jun 12 '14 at 09:29
  • Why not use `parse`? https://docs.python.org/2/library/string.html#string.Formatter.parse – user189 Jun 12 '14 at 09:35

2 Answers

7

Why reinvent the wheel? string.Formatter has a parse() method.

>>> import string
>>> [a[1] for a in string.Formatter().parse('{foo}/{bar}')]
['foo', 'bar']
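One caveat with the one-liner: when the string ends with literal text, parse() yields a final tuple whose field name is None, so it is worth filtering that out. A minimal sketch, where `field_names` is just an illustrative helper name and not part of the standard library:

```python
import string


def field_names(template):
    """Return the replacement-field names of a format string, in order."""
    parsed = string.Formatter().parse(template)
    # Each tuple is (literal_text, field_name, format_spec, conversion).
    # field_name is None when literal text is not followed by a field.
    return [name for _, name, _, _ in parsed if name is not None]


print(field_names('{foo}/{bar}.txt'))  # ['foo', 'bar']
```

Without the `is not None` check, the trailing '.txt' would add a None entry to the result.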
Burhan Khalid
Chris Clarke
1

You can use the string.Formatter.parse method. It splits the string into its literal text components and fields:

In [1]: import string

In [2]: formatter = string.Formatter()

In [3]: text = 'Here is some text with {replacement} fields {}'

In [4]: list(formatter.parse(text))
Out[4]: 
[('Here is some text with ', 'replacement', '', None),
 (' fields ', '', '', None)]

To retrieve the field names, simply iterate over the result and collect the second element of each tuple.

Note that this will include positional (both numbered and unnumbered) arguments as well.
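If only keyword arguments are wanted, the positional fields can be filtered out. The following is a rough sketch, not a definitive implementation: `keyword_fields` is a hypothetical helper name, and it assumes that a field such as '{foo.bar}' or '{foo[0]}' should be reported as the argument name 'foo'.

```python
import re
import string


def keyword_fields(template):
    """Yield the keyword-argument names used in a format string."""
    for _, field, _, _ in string.Formatter().parse(template):
        if field is None:
            # Trailing literal text with no field after it.
            continue
        # Keep only the argument name: drop any '.attr' or '[index]' suffix.
        first = re.split(r'[.\[]', field, maxsplit=1)[0]
        # '' means an auto-numbered '{}', digits mean a numbered '{0}'.
        if first and not first.isdigit():
            yield first


print(list(keyword_fields('{0} {} {foo}/{bar.name} tail')))  # ['foo', 'bar']
```

Note that a name is yielded every time it appears, so wrap the result in set() if duplicates matter.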

Note that this does not include nested arguments:

In [1]: import string

In [2]: formatter = string.Formatter()

In [3]: list(formatter.parse('{hello:{world}}'))
Out[3]: [('', 'hello', '{world}', None)]

If you want to get all named fields (assuming only named fields are used), you also have to parse the format spec, which is the third element in the tuple:

In [4]: def get_named_fields(text):
   ...:     formatter = string.Formatter()
   ...:     elems = formatter.parse(text)
   ...:     for _, field, spec, _ in elems:
   ...:         if field:
   ...:             yield field
   ...:         if spec:
   ...:             yield from get_named_fields(spec)
   ...:             

In [5]: list(get_named_fields('{hello:{world}}'))
Out[5]: ['hello', 'world']

(This solution handles arbitrarily deep nesting of format specifiers, although a single level would be sufficient, since format specifiers can only be nested one level deep.)
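For comparison, here is a sketch of a non-recursive variant that looks exactly one level into the format spec. The name `fields_one_level` is made up for this example, and like the generator above it assumes only named fields are used.

```python
import string


def fields_one_level(template):
    """Collect field names, including those nested one level in a format spec."""
    formatter = string.Formatter()
    names = []
    for _, field, spec, _ in formatter.parse(template):
        if field:
            names.append(field)
        if spec:
            # Parse the spec once more to find nested fields; no recursion.
            names.extend(
                inner for _, inner, _, _ in formatter.parse(spec) if inner
            )
    return names


print(fields_one_level('{hello:{world}}'))  # ['hello', 'world']
```

The recursive generator is shorter; this version just makes the one-level limit explicit.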

Bakuriu