30

I want to provide automatic string formatting in an API such that:

my_api("path/to/{self.category}/{self.name}", ...)

has its placeholders replaced with the values of the attributes named in the format string.


How do I extract the keyword arguments from a Python format string:

"non-keyword {keyword1} {{escaped brackets}} {} {keyword2}" => 'keyword1', 'keyword2'
Jace Browning

4 Answers

67

You can use the string.Formatter() class to parse out the fields in a string, with the Formatter.parse() method:

from string import Formatter

fieldnames = [fname for _, fname, _, _ in Formatter().parse(yourstring) if fname]

Demo:

>>> from string import Formatter
>>> yourstring = "path/to/{self.category}/{self.name}"
>>> [fname for _, fname, _, _ in Formatter().parse(yourstring) if fname]
['self.category', 'self.name']
>>> yourstring = "non-keyword {keyword1} {{escaped brackets}} {} {keyword2}"
>>> [fname for _, fname, _, _ in Formatter().parse(yourstring) if fname]
['keyword1', 'keyword2']

You can parse those field names further with the str._formatter_field_name_split() method (Python 2) or the _string.formatter_field_name_split() function (Python 3). This is an internal implementation detail that is not otherwise exposed; Formatter.get_field() uses it internally. The function returns the first part of the name, the one that is looked up in the arguments passed to str.format(), plus a generator for the rest of the field.

The generator yields (is_attribute, name) tuples; is_attribute is true if the next name is to be treated as an attribute, false if it is an item to look up with obj[name]:

try:
    # Python 3
    from _string import formatter_field_name_split
except ImportError:
    formatter_field_name_split = str._formatter_field_name_split
from string import Formatter

field_references = {formatter_field_name_split(fname)[0]
 for _, fname, _, _ in Formatter().parse(yourstring) if fname}

Demo:

>>> from string import Formatter
>>> from _string import formatter_field_name_split
>>> yourstring = "path/to/{self.category}/{self.name}"
>>> {formatter_field_name_split(fname)[0]
...  for _, fname, _, _ in Formatter().parse(yourstring) if fname}
{'self'}

Take into account that this function is part of the internal implementation details of the Formatter() class and can be changed or removed from Python without notice, and may not even be available in other Python implementations.
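If relying on the private function is a concern, a minimal sketch using only public APIs is to cut each field name at its first "." or "[". This is an approximation and not what Formatter does internally: the helper name root_names is hypothetical, numeric names such as "0" stay strings here, and placeholders nested inside a format spec are not inspected.

```python
import re
from string import Formatter


def root_names(fmt):
    """Return the set of top-level names referenced by a format string.

    Hypothetical helper; approximates the private field-name splitter.
    """
    roots = set()
    for _, fname, _, _ in Formatter().parse(fmt):
        if fname:
            # Everything before the first "." or "[" is the top-level name.
            roots.add(re.split(r"[.\[]", fname, maxsplit=1)[0])
    return roots
```

For the path example this yields the single name 'self', matching the demo above.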

Martijn Pieters
  • Just curious, Martijn, would *you* use a simple str.replace() or re.sub() (or something entirely different) to generate the new string? – mtik00 Sep 23 '14 at 14:04
  • @mtik00: depends on the use case; sometimes a `str.replace()` is just what is called for. – Martijn Pieters Sep 23 '14 at 14:06
  • Thanks for your input! In this specific case, I would have used `str.replace()` in a loop; nice and simple. – mtik00 Sep 23 '14 at 14:11
  • @mtik00: this is not a simple replace job however; the `str.format()` format allows for *nested* placeholders too, for example. – Martijn Pieters Sep 23 '14 at 14:13
  • When does the parse() function return a None as field? I'm not able to understand the documentation :( – Moberg Jan 21 '17 at 00:50
  • @Moberg: `None` is output when there are `{{` or `}}` escape braces, because at that point the parser stops to emit what it has so far, and that is only literal text; there's no field to name or to provide formatting info or a conversion function for. – Martijn Pieters Jan 21 '17 at 12:25
  • Great answer! my humble contribution - if you want to prevent field-name duplicates, do use the python `set` syntax like so: `{fname for _, fname, _, _ in Formatter().parse(yourstring) if fname}` – noamgot Apr 16 '18 at 21:10
  • @PlasmaBinturong: no, there isn't. – Martijn Pieters Feb 15 '19 at 14:56
5

Building off Martijn's answer, a simpler form of the list comprehension that I've used is:

>>> yourstring = "path/to/{self.category}/{self.name}"
>>> [x[1] for x in yourstring._formatter_parser() if x[1]]
['self.category', 'self.name']

It's functionally exactly the same, just much easier to digest.

MacMcIrish
  • I can't find any documentation for `_formatter_parser()`! is it an alias for `Formatter.parse()`? – heri0n Aug 08 '19 at 20:19
  • It may as well be afaik, here is the relevant section for `Formatter.parse()` https://github.com/python/cpython/blob/2085bd0877e17ad4d98a4586d5eabb6faecbb190/Lib/string.py#L260-L261, as well as the relevant section for `_formatter_parser()` itself: https://github.com/python/cpython/blob/c4cacc8c5eab50db8da3140353596f38a01115ca/Objects/stringlib/unicode_format.h#L1100-L1128 – MacMcIrish Aug 09 '19 at 21:27
  • As of Python 3.11.2, `_formatter_parser()` is no longer a method of str. You can replace it with `Formatter().parse(yourstring)`. – krumpelstiltskin Jun 28 '23 at 13:09
3

If all placeholders are named, a special dictionary can be used to intercept the keys that are looked up and record them in a list.

def format_keys(str_):
    class HelperDict(dict):
        def __init__(self):
            self._keys = []
        def __getitem__(self, key):
            self._keys.append(key)    
    d = HelperDict()
    str_.format_map(d)
    return d._keys

Note that unnamed placeholders are not supported: str.format_map() raises a ValueError for them, and .format() called without positional arguments raises an IndexError.
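One possible way to make this idea more robust, sketched under the assumption that only the key names matter and the formatted result is thrown away: have the dictionary return a stand-in object instead of None, so that format specs, attribute access and indexing do not raise. The names format_keys_robust and Anything are hypothetical.

```python
def format_keys_robust(str_):
    """Collect the top-level keys a format string looks up (hypothetical sketch)."""

    class Anything:
        # Stand-in value that accepts any attribute, index, or format spec.
        def __getattr__(self, name):
            return self

        def __getitem__(self, key):
            return self

        def __format__(self, spec):
            return ""

    class HelperDict(dict):
        def __init__(self):
            super().__init__()
            self._keys = []

        def __getitem__(self, key):
            # Record the key, then hand back a value that never fails to format.
            self._keys.append(key)
            return Anything()

    d = HelperDict()
    str_.format_map(d)  # The formatted output is discarded on purpose.
    return d._keys
```

Keys are recorded once per lookup, so a name used twice appears twice; wrap the result in set() if duplicates are unwanted.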

wim
CodeManX
  • A fun idea but it needs some polish. e.g. the `None` returned by `__getitem__` will cause unhandled exception from a format string like `'My {foo} is {bar:03d}'`. – wim Jan 26 '18 at 20:42
0

You can do "path/to/{self.category}/{self.name}".format(self=self). You could thus work with those kwargs in __getattr__.
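A minimal sketch of that idea, assuming the goal is to see which attributes the format string asks for: wrap the real object in a small proxy whose __getattr__ records each name before delegating. The Recorder and Item classes are hypothetical and exist only for this illustration.

```python
class Recorder:
    """Proxy that records attribute names requested during formatting."""

    def __init__(self, target, seen):
        self._target = target
        self._seen = seen

    def __getattr__(self, name):
        # Called only for names not found on the proxy itself.
        self._seen.append(name)
        return getattr(self._target, name)


class Item:
    # Hypothetical object with the attributes used in the question.
    def __init__(self, category, name):
        self.category = category
        self.name = name


seen = []
item = Item("tools", "hammer")
result = "path/to/{self.category}/{self.name}".format(self=Recorder(item, seen))
# result is "path/to/tools/hammer" and seen is ["category", "name"]
```

Unlike the parsing approaches, this needs a real object to format against, but it reports exactly the attributes that str.format() resolved.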

pacholik