I would like to compare a string A with a regex R.
A = u'Hi my friend, my name is Julio'
R = r'Hi\s+my\s+friend,\s+my\s+name\s+is([A-Za-z]+)'
At the moment I can easily check whether the string matches, thanks to re.match and re.search. Now I would like to study the differences between A and R when the match fails.
My first case is simple: I replace the group ([A-Za-z]+) with (.+) to find out whether the problem lies only in the group. If the relaxed regex matches, I can report that the string is well-formed except for the group that captures the name.
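A minimal sketch of this first check, assuming the A and R given above. The function name diagnose and its three return labels are hypothetical, chosen only for illustration. Note that with these exact values the strict pattern fails, because R has no \s+ between "is" and the group, while the relaxed one matches.

```python
import re

A = u'Hi my friend, my name is Julio'
R = r'Hi\s+my\s+friend,\s+my\s+name\s+is([A-Za-z]+)'
# Relaxed variant: same literal text, but the group accepts any characters.
R_RELAXED = R.replace('([A-Za-z]+)', '(.+)')


def diagnose(text, strict=R, relaxed=R_RELAXED):
    """Return 'ok', 'group' or 'structure' (hypothetical labels)."""
    if re.match(strict, text):
        return 'ok'          # the full regex matches
    if re.match(relaxed, text):
        return 'group'       # only the name group is at fault
    return 'structure'       # the literal part of the regex fails too


print(diagnose(A))  # 'group': the space before "Julio" is not in [A-Za-z]
```

Only the 'structure' case needs the finer-grained diff asked about below.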
If both step 1 and step 2 fail, I would like to produce a diff, similar to an HTML diff, but driven by the regex, so that I can identify where the regex failed.
I looked at difflib and its find_longest_match function, but it seems to work only character by character and not on substrings.
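For what it is worth, difflib.SequenceMatcher is documented to compare any sequences of hashable elements, so it may be usable on lists of words instead of single characters. The sketch below assumes whitespace tokenization and a made-up second string (actual); both are illustrative choices, not part of the original problem.

```python
import difflib

# Whitespace tokenization is an assumption; any list of hashable tokens works.
expected = u'Hi my friend, my name is Julio'.split()
actual = u'Hi my friend, his name is Julio'.split()  # hypothetical input

sm = difflib.SequenceMatcher(None, expected, actual)

# Longest run of identical tokens (not characters).
block = sm.find_longest_match(0, len(expected), 0, len(actual))
longest = expected[block.a:block.a + block.size]

# Token-level similarity: 2 * matched tokens / total tokens.
ratio = sm.ratio()

# Opcodes that are not 'equal' describe the token-level differences.
changes = [(tag, expected[i1:i2], actual[j1:j2])
           for tag, i1, i2, j1, j2 in sm.get_opcodes() if tag != 'equal']
print(longest, ratio, changes)
```

This still compares two plain strings, so it does not by itself take the regex into account.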
Do you have any idea or suggestion for computing a diff based on a regex comparison, and possibly a ratio measuring the similarity?
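One possible direction, sketched under stated assumptions: split R by hand into consecutive sub-patterns and match them one after another, reporting the first piece that fails. The list PIECES, the function regex_diff and the ratio "matched pieces / total pieces" are all hypothetical. The approach is greedy and does not backtrack across pieces, so it can report a failure for a string that the full regex would accept.

```python
import re

A = u'Hi my friend, my name is Julio'
# Hypothetical manual split of R; joined together the pieces give R back.
PIECES = [r'Hi', r'\s+', r'my', r'\s+', r'friend,', r'\s+', r'my',
          r'\s+', r'name', r'\s+', r'is', r'([A-Za-z]+)']


def regex_diff(pieces, text):
    """Match pieces left to right.

    Returns (number of pieces matched, position reached in text, ratio).
    """
    pos = 0
    for index, piece in enumerate(pieces):
        # Pattern.match(text, pos) anchors the piece at the current position.
        m = re.compile(piece).match(text, pos)
        if m is None:
            return index, pos, index / float(len(pieces))
        pos = m.end()
    return len(pieces), pos, 1.0


count, pos, ratio = regex_diff(PIECES, A)
if count < len(PIECES):
    print('failed on piece %r at offset %d: %r' % (PIECES[count], pos, A[pos:]))
```

With the A above, the first eleven pieces match and the group fails at offset 24, on the space before "Julio".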