I'd like to parse a list of regular expressions and calculate, for each one, the likelihood of finding a match in a given text/string.
E.g. finding '[AB]' in a string of length 1 should be something around 1/13 (considering only capital letters).
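As a rough sketch of where the 1/13 comes from, assuming every character is drawn uniformly and independently from the 26 capital letters (the helper name `class_probability` is made up for illustration):

```python
from fractions import Fraction
import string

# Assumption: the text consists of uniformly random capital letters.
ALPHABET = string.ascii_uppercase


def class_probability(chars, alphabet=ALPHABET):
    """Probability that one random letter is a member of the character class."""
    hits = set(chars) & set(alphabet)
    return Fraction(len(hits), len(alphabet))


print(class_probability("AB"))  # 2 of 26 letters match, i.e. 1/13
```

Real text is not uniformly distributed, so this would only be a first approximation.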
Is there a generic regex parser that returns the individual positions/alternatives?
I'm thinking of getting a list of positions in return ('[AB].A{2}' would yield '[['A','B'],'.',['AA']]').
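To make the desired output concrete, here is a minimal hand-written sketch (not pyparsing) that handles only flat patterns made of literals, '.', '[...]' classes and '{n}' counts. The function name `parse_flat` and the choice to expand '{n}' into full strings are my own assumptions:

```python
from itertools import product


def parse_flat(pattern):
    """Split a flat regex (literals, '.', [...], {n}) into a list of positions."""
    positions = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "[":                      # character class -> list of alternatives
            end = pattern.index("]", i)
            item = list(pattern[i + 1:end])
            i = end + 1
        elif ch == ".":                    # wildcard stays a plain '.'
            item = "."
            i += 1
        else:                              # single literal -> one-element list
            item = [ch]
            i += 1
        count = 1
        if i < len(pattern) and pattern[i] == "{":   # optional fixed repetition {n}
            end = pattern.index("}", i)
            count = int(pattern[i + 1:end])
            i = end + 1
        if item == ".":
            positions.extend(["."] * count)
        else:
            # Assumption: a repeated item is expanded into every string it can produce.
            positions.append(["".join(p) for p in product(item, repeat=count)])
    return positions


print(parse_flat("[AB].A{2}"))  # [['A', 'B'], '.', ['AA']]
```

This does not handle ranges such as '[A-Z]', escapes, or open-ended repetitions like '*' and '+'.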
The problem is the parsing of regular expressions with pyparsing.
Simple regexes are no problem, but when it comes to "alternatives" and repetitions, I'm lost: I find it hard to parse nested expressions like '((A[AB])|(AB))'.
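For the nested case, the kind of result I have in mind could come from a small recursive-descent parser. The following is only a sketch in plain Python rather than pyparsing; the name `parse_regex` and the tuple-based tree layout are hypothetical, and it covers just groups, '|', '[...]' classes, '.' and literals:

```python
def parse_regex(pattern):
    """Parse groups, '|', [...] classes, '.' and literals into a nested tuple tree."""
    pos = 0

    def parse_alt():
        # alt := seq ('|' seq)*
        nonlocal pos
        branches = [parse_seq()]
        while pos < len(pattern) and pattern[pos] == "|":
            pos += 1
            branches.append(parse_seq())
        return branches[0] if len(branches) == 1 else ("alt", branches)

    def parse_seq():
        # seq := atom*   (ends at '|', ')' or end of input)
        items = []
        while pos < len(pattern) and pattern[pos] not in "|)":
            items.append(parse_atom())
        return items[0] if len(items) == 1 else ("seq", items)

    def parse_atom():
        # atom := '(' alt ')' | '[' chars ']' | '.' | literal
        nonlocal pos
        ch = pattern[pos]
        if ch == "(":
            pos += 1
            node = parse_alt()             # recursion handles arbitrary nesting
            if pos >= len(pattern) or pattern[pos] != ")":
                raise ValueError("unbalanced parenthesis")
            pos += 1
            return node
        if ch == "[":
            end = pattern.index("]", pos)
            node = ("class", list(pattern[pos + 1:end]))
            pos = end + 1
            return node
        pos += 1
        return ("any",) if ch == "." else ("lit", ch)

    tree = parse_alt()
    if pos != len(pattern):
        raise ValueError("unexpected character at position %d" % pos)
    return tree


tree = parse_regex("((A[AB])|(AB))")
print(tree)
```

Here the outer group becomes an `("alt", [...])` node with two `("seq", [...])` branches, one for 'A[AB]' and one for 'AB'. Repetitions are left out of this sketch.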
Any thoughts?