There are quite a few questions on SO asking about how to parse a regex pattern and output all possible matches to that pattern. For some reason, though, every single one of them I can find (1, 2, 3, 4, 5, 6, 7, probably more) is either for Java or some variety of C (and just one for JavaScript), and I currently need to do this in PHP.
I’ve Googled to my heart’s (dis)content, but whatever I do, pretty much the only thing that Google gives me is links to the docs for preg_match() and pages about how to use regex, which is the opposite of what I want here.
My regex patterns are all very simple and guaranteed to be finite; the only syntax used is:
[] for character classes
() for subgroups (capturing not required)
| (pipe) for alternative matches within subgroups
? for zero-or-one matches
So an example might be [ct]hun(k|der)(s|ed|ing)? to match all possible forms of the verbs chunk, thunk, chunder and thunder, for a total of sixteen permutations.
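As a quick sanity check on that figure of sixteen: each part of the pattern contributes its choices independently, so the total is just the product of the number of options per part. The array below is written out by hand purely for illustration (nothing here parses the pattern yet):

```php
<?php
// Number of choices contributed by each part of [ct]hun(k|der)(s|ed|ing)?
// (counted by hand; this is an illustration, not a parser)
$choices = [
    '[ct]'        => 2, // c, t
    'hun'         => 1, // literal text, only one way to match
    '(k|der)'     => 2, // k, der
    '(s|ed|ing)?' => 4, // s, ed, ing, or nothing at all
];

$total = array_product($choices);
echo $total, PHP_EOL; // 16
```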
Ideally, there’d be a library or tool for PHP which will iterate through (finite) regex patterns and output all possible matches, all ready to go. Does anyone know if such a library/tool already exists?
If not, what is an optimised way to approach making one? This answer for JavaScript is the closest I’ve been able to find to something I should be able to adapt, but unfortunately I just can’t wrap my head around how it actually works, which makes adapting it more tricky. Plus there may well be better ways of doing it in PHP anyway. Some logical pointers as to how the task would best be broken down would be greatly appreciated.
Edit: Since apparently it wasn’t clear how this would look in practice, I am looking for something that will allow this type of input:
$possibleMatches = parseRegexPattern('[ct]hun(k|der)(s|ed|ing)?');
– and printing $possibleMatches should then give something like this (the order of the elements is not important in my case):
Array
(
[0] => chunk
[1] => thunk
[2] => chunks
[3] => thunks
[4] => chunked
[5] => thunked
[6] => chunking
[7] => thunking
[8] => chunder
[9] => thunder
[10] => chunders
[11] => thunders
[12] => chundered
[13] => thundered
[14] => chundering
[15] => thundering
)
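For reference, one way the task might be broken down, given how limited the syntax is, is a small recursive-descent expander: an alternation is a list of sequences separated by |, a sequence is a run of atoms (each optionally followed by ?), and an atom is a character class, a subgroup, or a single literal character. The following is only a rough sketch under those assumptions, not a tested library. The helper names parseAlternation(), parseSequence() and parseAtom() are made up for illustration, character classes are assumed to be plain lists with no ranges or negation, escapes are not handled, and strings are treated byte by byte (so no multibyte characters).

```php
<?php
// Expand a finite pattern into every string it can match.
// Assumed syntax only: literals, [abc], (...), | and ?.
function parseRegexPattern(string $pattern): array
{
    $pos = 0;
    $results = parseAlternation($pattern, $pos);
    if ($pos < strlen($pattern)) {
        // Only an unmatched ")" can stop the parser before the end.
        throw new InvalidArgumentException("Unexpected '{$pattern[$pos]}' at offset $pos");
    }
    return array_values(array_unique($results));
}

// alternation := sequence ('|' sequence)*
function parseAlternation(string $p, int &$pos): array
{
    $results = parseSequence($p, $pos);
    while ($pos < strlen($p) && $p[$pos] === '|') {
        $pos++; // skip the pipe
        $results = array_merge($results, parseSequence($p, $pos));
    }
    return $results;
}

// sequence := (atom '?'?)*  -- stops at '|', ')' or end of input
function parseSequence(string $p, int &$pos): array
{
    $results = ['']; // one empty string, so the first product works
    while ($pos < strlen($p) && $p[$pos] !== '|' && $p[$pos] !== ')') {
        $options = parseAtom($p, $pos);
        if ($pos < strlen($p) && $p[$pos] === '?') {
            $pos++;
            $options[] = ''; // zero-or-one: "nothing" is also an option
        }
        // Append every option to every string built so far.
        $next = [];
        foreach ($results as $prefix) {
            foreach ($options as $option) {
                $next[] = $prefix . $option;
            }
        }
        $results = $next;
    }
    return $results;
}

// atom := '[' chars ']' | '(' alternation ')' | single literal character
function parseAtom(string $p, int &$pos): array
{
    $ch = $p[$pos];
    if ($ch === '[') {
        $end = strpos($p, ']', $pos);
        if ($end === false || $end === $pos + 1) {
            throw new InvalidArgumentException("Bad character class at offset $pos");
        }
        $chars = substr($p, $pos + 1, $end - $pos - 1);
        $pos = $end + 1;
        return str_split($chars); // one option per listed character
    }
    if ($ch === '(') {
        $pos++; // skip "("
        $options = parseAlternation($p, $pos);
        if ($pos >= strlen($p) || $p[$pos] !== ')') {
            throw new InvalidArgumentException('Missing ")"');
        }
        $pos++; // skip ")"
        return $options;
    }
    $pos++;
    return [$ch]; // plain literal
}

$possibleMatches = parseRegexPattern('[ct]hun(k|der)(s|ed|ing)?');
print_r($possibleMatches);
```

With the example pattern this should produce the same sixteen words as listed above, though in a different order (the suffixed forms come before the bare ones). Every string is built eagerly in memory, so for patterns with many optional parts a generator-based variant would probably be kinder on memory.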