I'm trying to learn regular expressions and, despite some great posts on here and links to a regex site, I have a case that I kept hacking at out of sheer stubbornness but that never produced the match I was looking for. To understand it, consider the following code, which takes a list of strings and a pattern and tells us whether the pattern matches every item in the list or none of them:
import re

def matchNone(pattern, lst):
    return not any([re.search(pattern, i) for i in lst])

def matchAll(pattern, lst):
    return all([re.search(pattern, i) for i in lst])
To help with debugging, this simple code lets us add _test to a function name and see what is being passed to the any() or all() functions, which ultimately return the result:
def matchAll_test(pattern, lst):
    return [re.search(pattern, i) for i in lst]

def matchNone_test(pattern, lst):
    return [re.search(pattern, i) for i in lst]
This pattern and list produce True from matchAll():
wordPattern = "^[cfdrp]an$"
matchAll(wordPattern, ['can', 'fan', 'dan', 'ran', 'pan']) # True
On the surface, this pattern appears to work with matchNone() in our effort to reverse the pattern:
wordPattern = "^[^cfdrp]an|[cfdrp](^an)$"
matchNone(wordPattern, ['can', 'fan', 'dan', 'ran', 'pan']) # True
It returns True, as we hoped it would. But a true inverse of the pattern should also make matchNone() return False for any list in which no item is one of the original words ['can', 'fan', 'dan', 'ran', 'pan'], regardless of what those items are (i.e. "match anything except these 5 words").
Testing which changes to the words in this list get us False quickly shows that the pattern is not as successful as it first appears. If it were, matchNone() would return False for anything not in the aforementioned list.
These permutations helped uncover the shortcomings of my pattern:
["something unrelated", "p", "xan", "dax", "ccan", "dann", "ra"]
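To make the shortcomings concrete, here is a minimal sketch that checks each permutation on its own instead of collapsing the results with any(). The helper name match_report and the variable names attempt and permutations are introduced here purely for illustration. They are not part of the code above.

```python
import re

# The attempted inverse pattern from the question.
attempt = "^[^cfdrp]an|[cfdrp](^an)$"

# None of these strings is one of the five original words,
# so a true inverse should match every one of them.
permutations = ["something unrelated", "p", "xan", "dax", "ccan", "dann", "ra"]

def match_report(pattern, lst):
    """Hypothetical helper: map each string to whether the pattern matches it."""
    return {s: re.search(pattern, s) is not None for s in lst}

report = match_report(attempt, permutations)
for s, matched in report.items():
    print(f"{s!r}: {'match' if matched else 'no match'}")
```

Run this way, only "xan" is reported as a match. Every other string slips through unmatched, so matchNone() returns True for a list such as ["p", "dax"] even though a true inverse should give False there.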
While exploring the above, I tried other permutations as well: taking the original list, using the _test versions of the functions, and changing one letter at a time in the original words, and/or modifying or adding one term drawn from permutations like those above.
If anyone can find the true inverse of my original pattern, I would love to see it so I can learn from it.
To help with your investigation:
This pattern also works with matchAll() for all words, but I could not seem to create its inverse either: "^(can|fan|dan|ran|pan)$"
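For what it is worth, one direction that may lead to the inverse is a negative lookahead, since ^ only negates as the first character inside square brackets and acts as a start-of-string anchor inside parentheses. The sketch below is an assumption on my part, not a confirmed answer. The names inversePattern, words, and others are introduced here for illustration only.

```python
import re

def matchNone(pattern, lst):
    return not any(re.search(pattern, i) for i in lst)

def matchAll(pattern, lst):
    return all(re.search(pattern, i) for i in lst)

# Candidate inverse (an assumption, not a verified answer):
# succeed at the start of the string unless what follows is
# exactly one of the five words and then the end of the string.
inversePattern = r"^(?!(?:can|fan|dan|ran|pan)$)"

words = ['can', 'fan', 'dan', 'ran', 'pan']
others = ["something unrelated", "p", "xan", "dax", "ccan", "dann", "ra"]

print(matchNone(inversePattern, words))   # True: none of the five words match
print(matchAll(inversePattern, others))   # True: every other string matches
print(matchNone(inversePattern, others))  # False: as a true inverse should give
```

One caveat with this sketch: in Python, $ also matches just before a trailing newline, so "can\n" would be treated the same as "can". Using \Z in place of $ would be stricter.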
Thanks for any time you expend on this. I'm hoping to find a regEx guru on here who spots the mistake and can propose the right solution.