I am looking for a way to match a single string against a set of wildcard patterns. For example:
>>> match("ab", ["a*", "b*", "*", "c", "*b"])
["a*", "*", "*b"]
The order of the output is of no importance.
I will have in the order of 10^4 wildcard strings to match against and I will do around ~10^9 match calls. This means I will probably have to rewrite my code like so:
>>> matcher = prepare(["a*", "b*", "*", "c", "*b"])
>>> for line in lines: yield matcher.match(line)
["a*", "*", "*b"]
I've started writing a trie implementation in Python that handles wildcards, and I just need to get the corner cases right. Still, I am curious: how would you solve this? Are there any Python libraries that would help me solve this faster?
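For reference, here is a minimal sketch of the direction I am going (all names are my own, not from any library): patterns are inserted into a character trie, and during the walk a `*` node is allowed to consume any number of characters of the input.

```python
class WildcardMatcher:
    """Trie of wildcard patterns; '*' matches any (possibly empty) substring."""

    def __init__(self, patterns):
        self.root = {}  # char -> subtrie; '$' key lists patterns ending here
        for p in patterns:
            node = self.root
            for ch in p:
                node = node.setdefault(ch, {})
            node.setdefault('$', []).append(p)

    def match(self, s):
        results = []
        self._walk(self.root, s, 0, results)
        # Note: patterns containing several '*'s could be reported
        # more than once; dedupe here if that matters.
        return results

    def _walk(self, node, s, i, results):
        if '*' in node:
            # The star may consume any suffix length: try every split point.
            for j in range(i, len(s) + 1):
                self._walk(node['*'], s, j, results)
        if i == len(s):
            # End of input: collect patterns that end exactly here.
            results.extend(node.get('$', []))
            return
        if s[i] in node:
            self._walk(node[s[i]], s, i + 1, results)
```

The up-front trie construction is what the `prepare` step above would amount to; each `match` call then only walks the paths that are still compatible with the input.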
Some insights so far:
- Named groups in Python's `re` regular expressions will not help me here, since a combined alternation only reports one matching pattern.
- pyparsing seems like an awesome library, but it is sparsely documented and, as far as I can tell, does not support matching against multiple patterns.
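To illustrate the first point, here is roughly what I tried (the group names `p0`, `p1`, ... are my own, just for illustration): combine all patterns into one alternation, with `*` translated to `.*`.

```python
import re

patterns = ["a*", "b*", "*", "c", "*b"]

# One big alternation, one named group per pattern.
combined = re.compile(
    "|".join("(?P<p%d>%s)" % (i, p.replace("*", ".*"))
             for i, p in enumerate(patterns))
)

m = combined.fullmatch("ab")
# Only the first alternative that matches ("a*", group p0) is
# reported; "*" and "*b" also match "ab" but are lost.
```

This is why a single compiled regex does not solve the problem as stated: the regex engine stops at the first successful alternative instead of enumerating all of them.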