4

I'm confused by a very simple string count operation:

s = 'BANANA'
s.count('ANA')

This should result in 2, right? The substring ANA appears twice in BANANA.

But I got 1 as the result.

>>> s = 'BANANA'
>>> s.count('ANA')
1

I have no idea why the result is wrong. It is such a simple operation!

Appreciate any help.


PS: How can I solve this problem?

Henrique Branco

4 Answers

5

str.count() does not count overlapping occurrences.

If you would like to count overlapping occurrences, a simple loop over the string will do it:

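s = 'BANANA'
i = 0    # index to resume searching from
cnt = 0  # number of matches found so far
while True:
    i = s.find('ANA', i)  # returns -1 when there is no further match
    if i >= 0:
        i += 1  # step one character, not len('ANA'), so overlaps are found
        cnt += 1
    else:
        break
print(cnt)  # 2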

Alternatively, you can use a regex, as in @Henrique's answer below.

Ehsan
4

Good question. But Python's string count() doesn't "backtrack". Once it finds the first "ANA", it resumes searching after the end of that match, so only the remaining two letters, "NA", are left to search.

This kind of "forward search" works the same way in most programming languages, for example Java's indexOf(), C's strstr() and VB.NET's InStr().
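To illustrate the forward search described above, here is a rough sketch of a loop that behaves like count() for a non-empty substring. The helper name count_non_overlapping is made up for this example, and this is not how CPython actually implements count(); it only mimics the observable behaviour.

```python
def count_non_overlapping(text, sub):
    """Hypothetical helper that mimics str.count() for a non-empty sub."""
    if not sub:
        raise ValueError("sub must be non-empty")
    count = 0
    start = 0
    while True:
        pos = text.find(sub, start)
        if pos == -1:
            return count
        count += 1
        # Resume after the END of the match, so the matched
        # characters are never reused for another match.
        start = pos + len(sub)


print(count_non_overlapping('BANANA', 'ANA'))  # 1
print('BANANA'.count('ANA'))                   # 1
```

After the match at index 1, the search resumes at index 4, where only "NA" is left.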

FoggyDay
  • Should I use regex instead? – Henrique Branco Apr 21 '20 at 23:29
  • I think a regex might help: you'd a) define the search pattern (e.g. "ANA"), and then b) count the number of returned "groups". But you'd have to experiment - I'm not sure off the top of my head. Please post back what you find :) – FoggyDay Apr 21 '20 at 23:32
4

In 'BANANA', there is only a single non-overlapping 'ANA'. count() returns 1 because after it finds the first 'ANA', all that is left is 'NA'.
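A quick way to see where the occurrences start is to test every index with str.startswith(). This is only a small sketch; the variable names are arbitrary.

```python
s = 'BANANA'
sub = 'ANA'

# Every index at which sub begins, including overlapping ones.
starts = [i for i in range(len(s)) if s.startswith(sub, i)]
print(starts)  # [1, 3]
```

The two occurrences share the 'A' at index 3, which is why count() only reports one of them.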

2

Solved the problem using the third-party regex library. It has an overlapped parameter, which is extremely useful.

>>> import regex as re
>>> len(re.findall("ANA", "BANANA", overlapped=True))
2

I found the solution in this question here on SO.

wjandrea
Henrique Branco
  • You don't really need `regex` for this. `re` can do it with lookaround. See [this answer](/a/48889434/4518341). – wjandrea Jan 26 '23 at 18:46
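For reference, the lookaround approach with the standard re module might look something like the sketch below. The function name count_overlapping is made up for this example. A lookahead matches without consuming any characters, so the next match can start one position later.

```python
import re


def count_overlapping(text, sub):
    """Hypothetical helper: count occurrences of sub in text, overlaps included."""
    # (?=...) is a zero-width lookahead; re.escape() makes sure
    # sub is treated as literal text rather than as a pattern.
    pattern = f'(?={re.escape(sub)})'
    return len(re.findall(pattern, text))


print(count_overlapping('BANANA', 'ANA'))  # 2
```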