Find keyword arguments from a c-style python string and text

Question

How can I recover the keyword arguments that were passed to a C-style Python format string, given the resulting text?

Given:

bill eats apple
'%(name)s eats %(fruit)s'

Should get

{ 'name': 'bill', 'fruit' : 'apple'}

You may want to give [the `parse` module](https://pypi.python.org/pypi/parse) a try. — user94559, Oct 04 '17 at 06:26
ok, so nothing direct?, I would have to use `re` and stuff only. — garg10may, Oct 04 '17 at 06:28
I don't understand your question. Do you mean you want to try to only use modules that ship with Python? If so, why? Would you consider it acceptable to copy/paste the code from the `parse` module? (If not, why not? If so, why wouldn't you just use the module?) — user94559, Oct 04 '17 at 06:30
I am very happy with the module you suggested. At first it seemed a simple task to me, so I wondered whether it was really that hard and whether I couldn't come up with a line or two of code myself. As I understand it, it's not that simple/direct, given that strings can grow complex. — garg10may, Oct 04 '17 at 06:36
I'm not able to do it with the parse module either; it only seems to work with `{}` type parameters. — garg10may, Oct 04 '17 at 06:59

Arount · Answer 1 · 2017-10-04T08:29:13.533

1

First, as far as I know, there is no function or package in Python that does this for old-style (a.k.a. C-style) string formatting. A good reference about reversing C-style string formats.

The best you can get is a giant regex pattern and, as you know, that is far from a perfect solution.

That said,

As @smarx said in the comments, you can use parse, which is well suited to this. But, from the linked docs:

parse() is the opposite of format()

That means you need to use format() instead of %. That is arguably a good thing: % is Python's old-style string formatting, whereas format() is the newer style and is generally preferred in Python 3. (Both styles work in Python 2.7 and in Python 3.)

Here is an example with format():

>>> import parse  # third-party package: pip install parse
>>> print(parse.parse('{name} eats {fruit}', 'bill eats apple'))
<Result () {'fruit': 'apple', 'name': 'bill'}>

If you are not comfortable with format(), I advise you to take a look at pyformat.org, a really good guide.
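
If changing the templates is an option, the `%(name)s` placeholders could be rewritten mechanically rather than by hand. The following is only a sketch: `percent_to_braces` is a hypothetical helper name, and it assumes the templates use nothing but plain `%(name)s` placeholders (no `%d`, widths, or `%%`).

```python
import re

def percent_to_braces(template):
    """Hypothetical helper: rewrite %(name)s placeholders as {name}."""
    # Double any literal braces first so str.format does not read them as fields.
    escaped = template.replace("{", "{{").replace("}", "}}")
    return re.sub(r"%\((\w+)\)s", r"{\1}", escaped)

old_style = "%(name)s eats %(fruit)s"
new_style = percent_to_braces(old_style)
values = {"name": "bill", "fruit": "apple"}

print(new_style)
# Both styles should render the same text for the same values.
print(old_style % values == new_style.format(**values))
```

The converted template is what you would then hand to `parse.parse` in place of the original one.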

edited Oct 04 '17 at 08:29

answered Oct 04 '17 at 08:11


but my strings don't use format, would mean changing existing texts. – garg10may Oct 04 '17 at 08:12
I updated my answer. From what I saw you have two options: (1) use `format()`, or (2) use a homemade regex pattern and hope it works in most cases. Again, even if that means changing the texts, I advise you to use `format()` – Arount Oct 04 '17 at 08:30
@garg10may Can't you just `replace` `%(` with `{` and `)s` with `}` prior to using `parse`? – tobias_k Oct 04 '17 at 08:44

tobias_k · Accepted Answer · 2017-10-04T10:04:21.727

1

If you do not want to use parse, you can convert your pattern string to a regular expression using named groups and then use re.match and match.groupdict to get the mapping.

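>>> import re
>>> text = "bill eats apple"
>>> a = "%(name)s eats %(fruit)s"
>>> # "\\w" yields a literal \w in the result; a bare \w in the replacement is rejected as a bad escape on Python 3.7+
>>> p = re.sub(r"%\((\w+)\)s", r"(?P<\1>\\w+)", a)
>>> p
'(?P<name>\\w+) eats (?P<fruit>\\w+)'
>>> re.match(p, text).groupdict()
{'fruit': 'apple', 'name': 'bill'}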
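
The same idea can be wrapped in a small function. This is only a sketch under a few assumptions: `extract_kwargs` and `PLACEHOLDER` are hypothetical names, only plain `%(name)s` placeholders are handled, each name is a valid identifier that appears once in the template, and the literal parts of the template are passed through `re.escape` so that characters such as `.` or `?` are not read as regex syntax.

```python
import re

# Matches only simple %(name)s placeholders; other conversions are not handled.
PLACEHOLDER = re.compile(r"%\((\w+)\)s")

def extract_kwargs(template, text):
    """Hypothetical helper: return the dict of values, or None if no match."""
    parts = []
    pos = 0
    for m in PLACEHOLDER.finditer(template):
        # Escape the literal text between placeholders.
        parts.append(re.escape(template[pos:m.start()]))
        # One named group per placeholder; \w+ matches a single word.
        parts.append(r"(?P<%s>\w+)" % m.group(1))
        pos = m.end()
    parts.append(re.escape(template[pos:]))
    # fullmatch requires the whole text to fit the template.
    match = re.fullmatch("".join(parts), text)
    return match.groupdict() if match else None

print(extract_kwargs("%(name)s eats %(fruit)s.", "bill eats apple."))
```

Returning `None` on a mismatch avoids the `AttributeError` that `re.match(p, text).groupdict()` raises when the text does not fit the template.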

Note that \w+ will only match a single word. To allow for more complex values, you might instead use e.g. [^)]+, which matches any run of characters other than a closing parenthesis, spaces included:

>>> text = "billy bob bobbins eats a juicy apple"
>>> p = re.sub(r"%\((\w+)\)s", r"(?P<\1>[^)]+)", a)
>>> re.match(p, text).groupdict()
{'fruit': 'a juicy apple', 'name': 'billy bob bobbins'}
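
One caveat with multi-word values: if the literal text of the template can also occur inside a value, more than one split is valid and the regex simply picks one. The sketch below illustrates this with a made-up template and text; the non-greedy `[^)]+?` variant is an assumption added for comparison.

```python
import re

template = "%(first)s and %(second)s"
text = "salt and pepper and sugar"

# Greedy groups, as in the snippet above: the first value grabs as much as it can.
greedy = re.sub(r"%\((\w+)\)s", r"(?P<\1>[^)]+)", template)
# Non-greedy variant (an assumption for comparison): the first value stops early.
lazy = re.sub(r"%\((\w+)\)s", r"(?P<\1>[^)]+?)", template)

greedy_result = re.fullmatch(greedy, text).groupdict()
lazy_result = re.fullmatch(lazy, text).groupdict()

print(greedy_result)
print(lazy_result)

# Both candidates reproduce the original text when formatted back.
print(template % greedy_result == text, template % lazy_result == text)
```

Both dictionaries format back to the same text, so which one is "right" depends on what the values are expected to look like.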

edited Oct 04 '17 at 10:04

answered Oct 04 '17 at 08:53


This doesn't work when `name` has a space in between. – garg10may Oct 04 '17 at 09:56
@garg10may That's right, in this case you will have to replace `\w+` with something more complex, maybe `[^)]+` or similar. – tobias_k Oct 04 '17 at 09:57
that gives `sre_constants.error: unbalanced parenthesis` – garg10may Oct 04 '17 at 10:02
thx, not sure how it works but your `re` works magic. Using parse would have required changing everything, and given that I was using Django translations it would have been more cumbersome. Replacing `%` prior to using parse would also not be possible, since then the translations wouldn't work. – garg10may Oct 04 '17 at 10:57
