I answered a question the other day about finding the strings that occur between two specified characters. I ended up with this fairly basic regular expression:
>>> import re
>>> def smallest_between_two(a, b, text):
...     return re.findall(re.escape(a) + "(.*?)" + re.escape(b), text)
...
>>> smallest_between_two(' ', '(', 'def test()')
['test']
>>> smallest_between_two('[', ']', '[this one][this one too]')
['this one', 'this one too']
>>> smallest_between_two('paste ', '/', '@paste "game_01/01"')
['"game_01']
However, when I looked it over again, I realized that it misses results whenever one match is partially contained inside another. Here is an example:
>>> smallest_between_two(' ', '(', 'here is an example()')
['is an example']
I am unsure why it is not also finding 'an example' and 'example', as both of those also occur between a ' ' and a '('.
I would rather not do this to find additional matches:
>>> first_iteration = smallest_between_two(' ', '(', 'here is an example()')
>>> smallest_between_two(' ', '(', first_iteration[0] + '(')
['an example']
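For reference, here is a minimal sketch of the behavior I am after. It assumes the cause is that `re.findall` only returns non-overlapping matches, and that wrapping the pattern in a zero-width lookahead is an acceptable way around that. The name `all_between_two` is hypothetical, and this is only one possible approach, not necessarily the best one:

```python
import re

def all_between_two(a, b, text):
    # Hypothetical helper: the whole pattern sits inside a lookahead (?=...),
    # so a match consumes no characters and the engine can try again from
    # every later position, including ones inside an earlier match.
    pattern = "(?=" + re.escape(a) + "(.*?)" + re.escape(b) + ")"
    # With a single capture group, findall returns the captured strings.
    return re.findall(pattern, text)

print(all_between_two(' ', '(', 'here is an example()'))
# ['is an example', 'an example', 'example']
```

For the earlier inputs, where the matches do not overlap, this should return the same lists as `smallest_between_two`.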