I have a long string (e.g. AAAABBBBCCCC) and I eventually want to find all overlapping occurrences for each member of a list of different substrings (e.g. ['AAA', 'AAB', 'ABB', 'BBB']).
I found a very helpful suggestion in a previous StackOverflow post, "string count with overlapping occurrences". However, using it I can't seem to pass the substrings in a way that re.findall() recognizes. It's probably something stupid, but I just can't figure it out. It seems like the ? is doing something different than usual...
>>> import re
>>> string = 'AAAABBBBCCCC'
>>> len(re.findall('(?=AAA)', string))
2
>>> substring = 'AAA'
>>> len(re.findall('(?=substring)', string))
0
>>> substring = "'(?=AAA)'"
>>> len(re.findall(substring, string))
0
>>> # This works, but does not count overlapping matches:
>>> substring = 'AAA'
>>> len(re.findall(substring, string))
1
I would appreciate any suggestions! Thanks!
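For reference, here is a minimal sketch of one way this might be made to work, assuming the problem is that '(?=substring)' searches for the literal text "substring" rather than the variable's value, and that "'(?=AAA)'" puts the quote characters inside the pattern itself. The pattern is built by concatenating the variable into the lookahead. The helper name count_overlapping is hypothetical, and re.escape is an added precaution in case a substring contains regex metacharacters.

```python
import re

def count_overlapping(text, substrings):
    """Return {substring: count of overlapping occurrences in text}.

    count_overlapping is a hypothetical helper name, not from the original post.
    """
    counts = {}
    for sub in substrings:
        # Insert the variable's value into the lookahead; writing
        # '(?=substring)' would search for the literal word "substring".
        pattern = '(?=' + re.escape(sub) + ')'
        # Each zero-width match marks one start position, so matches may overlap.
        counts[sub] = len(re.findall(pattern, text))
    return counts

print(count_overlapping('AAAABBBBCCCC', ['AAA', 'AAB', 'ABB', 'BBB']))
# prints {'AAA': 2, 'AAB': 1, 'ABB': 1, 'BBB': 2}
```

Because the lookahead consumes no characters, the search resumes one position after each match, which is what allows the two overlapping occurrences of AAA in AAAA to both be counted.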