I want to enumerate all possible strings matched by a regular expression.
All the regular expressions I want to match against have no *
or +
only something like x*{5}
which is equivalent to x?x?x?x?x?
.
so given any regular expression like the one below:
[a-c]?cdr*{0,2}
i want all strings matching the expression. Thus the library or program shall output something like this:
cd, acd, bcd, ccd, cdr, acdr, bcdr, ccdr, cdrr, acdrr, bcdrr, ccdrr
I don't care about the language it is implemented in as long as it runs in linux.
refinement: if the regular expression is transformed into a deterministic finite automaton the automaton must be representable as a directed acyclic graph. that's why the possible output strings must be enumerable (not infinitely long strings).