I am working on some string search exercises to learn more efficient ways of searching. I am trying to count how many times a substring occurs in a given string by using backward search. For example, given the following strings:
original = 'panamabananas$'
s = 'smnpbnnaaaaa$a'
s1 = '$aaaaaabmnnnps'  # sorted version of s
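For context, s looks like the Burrows-Wheeler transform of original, with s1 being its characters in sorted order. Under that assumption, a minimal sketch of how both strings could be derived is shown here; the helper name bwt is hypothetical and the naive rotation sort is only practical for short inputs.

```python
def bwt(text):
    """Naive Burrows-Wheeler transform: last column of the sorted rotations.

    Assumes text ends with a unique sentinel ('$') that sorts before
    every other character.
    """
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return ''.join(rotation[-1] for rotation in rotations)


original = 'panamabananas$'
s = bwt(original)          # last column of the sorted rotations
s1 = ''.join(sorted(s))    # first column: the same characters, sorted
print(s)   # smnpbnnaaaaa$a
print(s1)  # $aaaaaabmnnnps
```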
I am trying to find how many times the substring 'ban' occurs. To do so, I was thinking of iterating through both strings with the zip function. In the backward search, I should first look for the last character of 'ban' ('n') in s1 and see where it lines up with the next character, 'a', in s. It matches at indexes 9, 10 and 11 (0-based), which are the third, fourth and fifth 'a' in s. The next character to look for is 'b', but only for the matches found before (that is, where 'n' in s1 lined up with 'a' in s). So we take those 'a's (third, fourth and fifth) from s and check whether the third, fourth or fifth 'a' in s1 lines up with a 'b' in s. If so, we have found an occurrence of 'ban'.
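The steps described above could be sketched roughly as follows. This assumes s is the Burrows-Wheeler transform of the original string and that the sorted string can be rebuilt from it. The names count_occurrences, first_occurrence, top and bottom are hypothetical, chosen only for illustration. Instead of saving each partial match, the sketch keeps a range of rows in the sorted string and narrows it one pattern character at a time, from the last character to the first.

```python
def count_occurrences(bwt_text, pattern):
    """Count occurrences of pattern using backward search on bwt_text."""
    first_col = sorted(bwt_text)

    # Index of the first row in the sorted column that starts with each character.
    first_occurrence = {}
    for index, char in enumerate(first_col):
        first_occurrence.setdefault(char, index)

    # Current range of candidate rows (inclusive), initially every row.
    top, bottom = 0, len(bwt_text) - 1

    for char in reversed(pattern):
        if char not in first_occurrence:
            return 0
        # The k-th char in bwt_text corresponds to the k-th char in first_col,
        # so counting chars before the range maps it onto the sorted column.
        new_top = first_occurrence[char] + bwt_text.count(char, 0, top)
        new_bottom = first_occurrence[char] + bwt_text.count(char, 0, bottom + 1) - 1
        top, bottom = new_top, new_bottom
        if top > bottom:
            return 0  # no row in the range is preceded by char

    return bottom - top + 1


s = 'smnpbnnaaaaa$a'
print(count_occurrences(s, 'ban'))  # 1
print(count_occurrences(s, 'ana'))  # 3
```

For 'ban' the range goes from rows 9 to 11 (the 'n' rows), then rows 3 to 5 (the third, fourth and fifth 'a'), then row 7 (the single 'b'), which agrees with the walk-through above. Calling str.count at every step is simple but slow on long strings; precomputed counts would be the usual refinement.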
It seems complex to me to iterate and keep track of the partial matches, so what I tried is something like this:
n = 0  # counter of occurrences
for i, j in zip(s1, s):
    if i == 'n' and j == 'a':  # this should save the match
        if i[3:6] == 'a' and any(j[3:6] == 'b'):
            n += 1
I think nested if statements may be needed, but I am still a beginner. I am getting 0 occurrences when there is one occurrence of 'ban' in the original.