1

How can I recover the keyword arguments passed to a C-style Python format string?

Given:

bill eats apple
'%(name)s eats %(fruit)s'

Should get

{ 'name': 'bill', 'fruit' : 'apple'}
garg10may
  • You may want to give [the `parse` module](https://pypi.python.org/pypi/parse) a try. – user94559 Oct 04 '17 at 06:26
  • ok, so nothing direct?, I would have to use `re` and stuff only. – garg10may Oct 04 '17 at 06:28
  • I don't understand your question. Do you mean you want to try to only use modules that ship with Python? If so, why? Would you consider it acceptable to copy/paste the code from the `parse` module? (If not, why not? If so, why wouldn't you just use the module?) – user94559 Oct 04 '17 at 06:30
  • I am very happy with the module you suggested. At first it seemed a simple task to me, so I wondered whether it was really that hard and whether I couldn't come up with a line or two of code myself. As I understand it, it's not that simple or direct, given that strings can grow complex. – garg10may Oct 04 '17 at 06:36
  • I am not able to do it with the parse module either; it only seems to work with `{}`-type parameters. – garg10may Oct 04 '17 at 06:59

2 Answers

1

First, as far as I know there is no function or package in Python that allows you to do that with old-style (a.k.a. C-style) string formatting. A good reference about reversing c-style string format.

The best you can do is a giant regex pattern, and as you know that is far from a perfect solution.


That said,

As @smarx said in the comments, you can use parse, which is well suited to this. Note, however, what its documentation says:

parse() is the opposite of format()

That means you need to use format() instead of %. That is a good thing: % is Python's old-style string formatting, whereas format() is the new style and the preferred one in Python 3. (format() is available in both Python 2.7 and 3; % also still works in both, but parse cannot reverse it.)

Here is an example with format():

import parse  # third-party package: pip install parse
print(parse.parse('{name} eats {fruit}', 'bill eats apple'))
<Result () {'fruit': 'apple', 'name': 'bill'}>

If you are not comfortable with format(), I advise you to take a look at pyformat.org, a really good guide.

Arount
  • But my strings don't use format; that would mean changing existing texts. – garg10may Oct 04 '17 at 08:12
  • I updated my answer. From what I saw, you have two solutions: use `format()`, or use a homemade regex pattern and hope it works in most cases. Again, even if that means changing the texts, I advise you to use `format()`. – Arount Oct 04 '17 at 08:30
  • @garg10may Can't you just `replace` `%(` with `{` and `)s` with `}` prior to using `parse`? – tobias_k Oct 04 '17 at 08:44
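
The replacement suggested in the last comment could be sketched roughly as below. This is only an illustration: the helper name `percent_to_brace` is hypothetical, it assumes the pattern contains nothing but `%(name)s` placeholders (no `%d`, widths, or `%%`), and it doubles any literal braces so they are not mistaken for fields. The converted string could then be handed to `parse.parse`, which is third-party and deliberately not imported here.

```python
import re

# Matches only named string placeholders such as %(name)s.
# Assumption: other conversions (%d, %5.2f, %%) do not occur in the pattern.
PLACEHOLDER = re.compile(r"%\((\w+)\)s")


def percent_to_brace(fmt):
    """Convert '%(name)s'-style placeholders to '{name}'-style fields."""
    # Double literal braces first so they survive as literal text.
    escaped = fmt.replace("{", "{{").replace("}", "}}")
    # Then rewrite each %(name)s as {name}.
    return PLACEHOLDER.sub(r"{\1}", escaped)


brace_pattern = percent_to_brace("%(name)s eats %(fruit)s")
print(brace_pattern)
```

Because the conversion happens at run time, the stored `%`-style strings themselves do not need to be edited.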
1

If you do not want to use parse, you can convert your pattern string to a regular expression using named groups and then use re.match and match.groupdict to get the mapping.

>>> import re
>>> text = "bill eats apple"
>>> a = "%(name)s eats %(fruit)s"
>>> p = re.sub(r"%\((\w+)\)s", r"(?P<\1>\w+)", a)
>>> p
'(?P<name>\\w+) eats (?P<fruit>\\w+)'
>>> re.match(p, text).groupdict()
{'fruit': 'apple', 'name': 'bill'}

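Note that \w+ will only match a single word. To allow for more complex values, you might instead use e.g. [^)]+, which matches any run of characters other than ) and therefore also accepts spaces. The literal text between the placeholders (here " eats ") then determines where one value ends and the next begins.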

>>> text = "billy bob bobbins eats a juicy apple"
>>> p = re.sub(r"%\((\w+)\)s", r"(?P<\1>[^)]+)", a)
>>> re.match(p, text).groupdict()
{'fruit': 'a juicy apple', 'name': 'billy bob bobbins'}
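
One thing the snippets above do not handle is literal text in the pattern that contains regex metacharacters such as `.` or `(`. A slightly more defensive variant might look like the sketch below. The names `format_to_regex` and `extract_kwargs` are hypothetical, and the sketch assumes only `%(name)s` placeholders appear (no `%d`, no `%%`), that each name is a valid, unique regex group name, and Python 3.4+ for `fullmatch`.

```python
import re

# Matches only named string placeholders such as %(name)s (assumption:
# no other conversions like %d or %% occur in the format string).
PLACEHOLDER = re.compile(r"%\((\w+)\)s")


def format_to_regex(fmt):
    """Build a compiled regex from a '%(name)s'-style format string."""
    parts = []
    pos = 0
    for m in PLACEHOLDER.finditer(fmt):
        # Escape the literal text so '.', '(', '?' etc. match themselves.
        parts.append(re.escape(fmt[pos:m.start()]))
        # Non-greedy named group: takes as little as the literals allow.
        parts.append("(?P<%s>.+?)" % m.group(1))
        pos = m.end()
    parts.append(re.escape(fmt[pos:]))
    return re.compile("".join(parts))


def extract_kwargs(fmt, text):
    """Return the dict of placeholder values, or None if text does not fit."""
    m = format_to_regex(fmt).fullmatch(text)
    return m.groupdict() if m else None


print(extract_kwargs("%(name)s eats %(fruit)s", "billy bob bobbins eats a juicy apple"))
```

Recovery is still ambiguous whenever a value contains the separating literal itself (a name containing " eats ", say), so this remains a best-effort approach rather than a true inverse of `%` formatting.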
tobias_k
  • This doesn't work when `name` has a space in it. – garg10may Oct 04 '17 at 09:56
  • @garg10may That's right, in this case you will have to replace `\w+` with something more complex, maybe `[^)]+` or similar. – tobias_k Oct 04 '17 at 09:57
  • that gives `sre_constants.error: unbalanced parenthesis` – garg10may Oct 04 '17 at 10:02
  • Thanks, not sure how it works, but your `re` works like magic. Using parse would have required changing everything, and since I am using Django translations it would have been more cumbersome. Replacing `%` prior to using parse would not be possible either, since the translations would then stop working. – garg10may Oct 04 '17 at 10:57