1

I am trying to find the number of times a substring, in this case 'bob', appears in a string. My solution works for some strings, but not for all. For example, the answer for the following string should be 7, but I am getting 5.

Any ideas why?

Thanks

s = 'bobbisbobobugbobobbobbobo'
print('Number of times bob occurs is: ', s.count('bob'))
JD2775

3 Answers

4

The problem is that s.count(sub) returns the number of non-overlapping occurrences of the substring sub in the range [start, end]. Once a match is counted, its characters cannot be part of another match.
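To see where the two counts diverge, here is a minimal sketch that emulates both scans with str.find. The helper name match_positions and its overlapping flag are made up for illustration; they are not part of the standard library.

```python
def match_positions(text, sub, overlapping):
    """Return the start index of every match of sub in text (hypothetical helper)."""
    if not sub:
        raise ValueError("sub must be a non-empty string")
    positions = []
    start = text.find(sub)
    while start != -1:
        positions.append(start)
        # Overlapping scan: resume one character later.
        # Non-overlapping scan (what str.count does): resume after the whole match.
        step = 1 if overlapping else len(sub)
        start = text.find(sub, start + step)
    return positions


s = 'bobbisbobobugbobobbobbobo'
print(match_positions(s, 'bob', overlapping=False))  # [0, 6, 13, 18, 21]
print(match_positions(s, 'bob', overlapping=True))   # [0, 6, 8, 13, 15, 18, 21]
```

The matches at indices 8 and 15 start inside the previous match, which is why the non-overlapping scan skips them and reports 5.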

To count overlapping occurrences, use a regex with a lookahead:

import re

text = 'bobbisbobobugbobobbobbobo'
print(len(re.findall('(?=bob)', text)))
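As a rough illustration of why the lookahead helps: (?=bob) is a zero-width assertion, so it matches at a position where 'bob' follows without consuming any characters, and the next search resumes just one character further on. The sketch below compares the start positions found by the plain pattern and by the lookahead; the variable names are only for illustration.

```python
import re

text = 'bobbisbobobugbobobbobbobo'

# Plain pattern: each match consumes three characters, so overlaps are skipped.
consuming = [m.start() for m in re.finditer(r'bob', text)]

# Lookahead: each match is empty, so every position where 'bob' starts is found.
lookahead = [m.start() for m in re.finditer(r'(?=bob)', text)]

print(consuming)  # [0, 6, 13, 18, 21]
print(lookahead)  # [0, 6, 8, 13, 15, 18, 21]
```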
Taku
Lucas Hendren
  • Thanks. Could you explain the difference between str.count and re.findall, since the two return different values? Also, in your code you passed the substring as '(?=bob)'. Could you explain why it is written in this format and what the logic behind it is? – Adnan May 01 '20 at 10:03
1

Your solution does not work because str.count does not count overlapping matches.

Although there are plenty of other solutions, another way to do this is to use the third-party regex module, whose findall accepts an overlapped flag:

import regex as re
s = 'bobbisbobobugbobobbobbobo'
print(len(re.findall("bob", s, overlapped=True)))

# 7
Taku
0

You seem to want overlapping counts. str.count will not get you there, unfortunately, because it does not overlap substring searches. Try slicing and counting.

Here's a solution with collections.Counter, though it can be done just about any other way as long as you slice it right.

from collections import Counter

text = 'bobbisbobobugbobobbobbobo'
term = 'bob'
# Take a len(term)-wide slice at every valid start index and tally the slices.
c = Counter(text[i : i + len(term)] for i in range(len(text) - len(term) + 1))
print(c[term])

7
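As one example of "any other way", here is a minimal sketch that skips the Counter and checks each start index with str.startswith. The function name count_overlapping is made up for illustration.

```python
def count_overlapping(text, term):
    """Count occurrences of term in text, including overlapping ones (hypothetical helper)."""
    if not term:
        raise ValueError("term must be a non-empty string")
    # str.startswith(term, i) tests the slice beginning at index i without copying it.
    # True counts as 1 when summed.
    return sum(text.startswith(term, i) for i in range(len(text) - len(term) + 1))


text = 'bobbisbobobugbobobbobbobo'
print(count_overlapping(text, 'bob'))  # 7
```

This only tallies the one term you care about instead of every slice, so it is a reasonable choice if you don't need counts for other substrings.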
cs95