
I wrote a function that looks for every occurrence of a pattern in a string.

def patternSearchResultShow(text, pattern, counter, positions):
    print('Text = \'' + text + '\'')
    print('Pattern = \'' + pattern + '\'')
    print('Number of pattern occurrences: ' + str(counter))
    print('The template enters into positions: ' + str(positions))
    
def patternSearch(text, pattern):
    counter = 0
    positions = []
    for i in range(len(text) - 1):
        for j in range(len(pattern)):
            if (text[i + j] != pattern[j]):
                break
        if (j == len(pattern) - 1):
            positions.append(i)
            counter = counter + 1
            
    patternSearchResultShow(text, pattern, counter, positions)

print('My pattern search:')
text = 'aaaa'
pattern = 'a'
patternSearch(text, pattern)

First test:
text = 'aaaa'
pattern = 'a'

My output:

Number of pattern occurrences: 3
The template enters into positions: [0, 1, 2]

Expected output:

Number of pattern occurrences: 4
The template enters into positions: [0, 1, 2, 3]

To get the expected result, I changed the outer loop:
for i in range(len(text) - 1): -> for i in range(len(text)):
Now the output is:

Number of pattern occurrences: 4
The template enters into positions: [0, 1, 2, 3]

Second test:
text = 'aaaa'
pattern = 'aa'

My output:

IndexError: string index out of range
np.

1 Answer


The IndexError happens because text[i + j] reads past the end of text once i + j reaches len(text).

Your original version did not find all the matches because range(len(text) - 1) stops before the last character, so that position is never tested.

You should make the upper bound of the outer loop depend on the length of pattern:

for i in range(len(text)-len(pattern)+1):

You can also replace the inner loop with a slice comparison.

if text[i:i+len(pattern)] == pattern:
    positions.append(i)
    counter += 1
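
Putting both changes together, a minimal sketch of the whole search might look like the following. The name pattern_search and the choice to return the list instead of printing are assumptions made for illustration, not part of the original code. The slice comparison also sidesteps the check `j == len(pattern) - 1` from the question, which is true as well when the mismatch happens on the last character of the pattern.

```python
def pattern_search(text, pattern):
    """Return the start index of every (possibly overlapping) match.

    Hypothetical helper for illustration; the question's version prints
    its results instead of returning them.
    """
    positions = []
    # The last valid start index is len(text) - len(pattern), so the
    # compared window never extends past the end of text.
    for i in range(len(text) - len(pattern) + 1):
        # Compare the whole window at once instead of character by character.
        if text[i:i + len(pattern)] == pattern:
            positions.append(i)
    return positions


print(pattern_search('aaaa', 'a'))   # [0, 1, 2, 3]
print(pattern_search('aaaa', 'aa'))  # [0, 1, 2]
```

The number of occurrences is then just len(positions), so a separate counter is not needed. If pattern is longer than text, the range is empty and the function returns an empty list.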
Barmar
  • What does ...i:i... mean? Is it a range of characters? For example, if text = 'abcd', is text[1:3] = 'bc'? – np. Jan 13 '22 at 11:05
  • Yes. This is Python's slice notation. See https://stackoverflow.com/questions/509211/understanding-slice-notation – Barmar Jan 13 '22 at 15:09
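
As a small illustration of the slice notation discussed in these comments, the sketch below uses the 'abcd' example from the question in the comment. The variable names middle and tail are made up for the example.

```python
text = 'abcd'

# text[start:stop] takes the characters from index start up to,
# but not including, index stop.
middle = text[1:3]
print(middle)  # bc

# Unlike indexing a single character, a slice that runs past the end
# is clamped to the string length instead of raising IndexError.
tail = text[2:10]
print(tail)  # cd
```

This clamping is why a slice comparison near the end of the text simply fails to match rather than raising an error.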