Finding number of times a substring exists in a string - Python

Question

I am trying to find the number of times a substring, in this case 'bob', appears in a string. My solution works for some strings, but not for all. For example, the answer to the following should be 7, but I am getting 5.

Any ideas why?

Thanks

s = 'bobbisbobobugbobobbobbobo'
print('Number of times bob occurs is: ', s.count('bob'))

`count` counts the non-overlapping matches. That's why it's less than what you see in `s`. — Arda Arslan, Sep 05 '17 at 02:48
`sum('bob' == s[i:i+len('bob')] for i in range(len(s)-(len('bob')-1)))` — dawg, Sep 05 '17 at 03:17

score 4 · Answer 1 · edited Sep 05 '17 at 03:01

The problem is that s.count() returns the number of non-overlapping occurrences of substring sub in the range [start, end].
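
A minimal sketch of that behaviour, assuming a shorter made-up string ('aaaa' is not from the question) so the difference is easy to see by eye:

```python
# Illustration only: 'aaaa' is a made-up example string, not from the question.
s = 'aaaa'

# str.count resumes scanning after the end of each match,
# so it finds 'aa' at index 0 and index 2 only.
non_overlapping = s.count('aa')

# Checking every start index also catches the match at index 1.
overlapping = sum(s.startswith('aa', i) for i in range(len(s)))

print(non_overlapping, overlapping)  # 2 3
```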

To count overlapping matches, use a regex with a lookahead:

import re

text = 'bobbisbobobugbobobbobbobo'
print(len(re.findall('(?=bob)', text)))
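
As for why the pattern is written as '(?=bob)': a lookahead matches a position without consuming any characters, so the scan moves forward one character at a time and every start index gets tested. The sketch below is only an illustration of that idea. It uses re.finditer to show where each match starts, and the names term, pattern and starts are assumptions introduced here, not part of the original answer.

```python
import re

text = 'bobbisbobobugbobobbobbobo'
term = 'bob'

# (?=...) is a zero-width lookahead: it matches the empty string at any
# position that is followed by the term, without consuming characters.
# re.escape guards against terms containing regex metacharacters.
pattern = re.compile('(?=' + re.escape(term) + ')')

starts = [m.start() for m in pattern.finditer(text)]
print(starts)       # [0, 6, 8, 13, 15, 18, 21]
print(len(starts))  # 7
```

The matches starting at 8 and 15 are the ones str.count skips, because each begins on the last character of the previous match.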

edited Sep 05 '17 at 03:01

Taku

answered Sep 05 '17 at 02:49

Lucas Hendren

Thanks. Could you explain the difference between string.count and re.findall, given that the two return different values? Also, in your code you passed the substring as '(?=bob)'. Why is it written in this format, and what is the logic behind it? – Adnan May 01 '20 at 10:03

score 1 · Answer 2 · answered Sep 05 '17 at 02:56

Your solution does not work because str.count does not count overlapping matches.

Although there are plenty of other solutions, another option is the third-party regex module, whose findall accepts an overlapped flag:

import regex as re
s = 'bobbisbobobugbobobbobbobo'
print(len(re.findall("bob", s, overlapped=True)))

# 7

score 0 · Accepted Answer · answered Sep 05 '17 at 02:51

You seem to want overlapping counts. str.count will not get you there, unfortunately, because it does not overlap substring searches. Try slicing and counting.

Here's a solution with a collections.Counter, though it can be done in just about any other way as long as you slice it right.

from collections import Counter

text = 'bobbisbobobugbobobbobbobo'
term = 'bob'
# Count every len(term)-wide slice of text; overlapping windows are included.
c = Counter(text[i : i + len(term)] for i in range(len(text) - len(term) + 1))
print(c[term])
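
If only one term needs counting, the same idea can be sketched without building a Counter, by calling str.find repeatedly and resuming one character after each hit. The helper name count_overlapping is hypothetical, and returning 0 for an empty term is an assumption made here rather than anything stated in the answer.

```python
def count_overlapping(text, term):
    """Count occurrences of term in text, allowing overlaps."""
    if not term:
        return 0  # assumption: treat an empty term as having no matches
    count = 0
    start = text.find(term)
    while start != -1:
        count += 1
        # Resume one character after the match start, not after its end,
        # so overlapping occurrences are still found.
        start = text.find(term, start + 1)
    return count

print(count_overlapping('bobbisbobobugbobobbobbobo', 'bob'))  # 7
```

This avoids creating a slice for every index, which may matter for long strings.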
