This is a short one, yet very irritating. I know I can count the number of times a string occurs within another string like this:

>>> 'banana'.count('a')
3

meaning that banana contains the letter "a" 3 times.

This is where it gets kind of weird.

My first confusion is: when I do 'foo'.count(''), what does Python look for?

Is '' == None == anything?

It doesn't seem to be the case, but then again, what IS '' logically speaking? And more importantly, why does

>>> 'test'.count('')
5

return one more than the length of the string?

What the hell is included in a string that's always 1 higher than the number of letters? The void?

EDIT: two ' characters in a row look like one " character. To avoid confusion: I am talking about two ' characters here.

EDIT2: There seems to be some confusion about how the count of '' comes about. Refer to the comments below.

pushkin
Flying Thunder

2 Answers

Every string [1] can be thought of as:

any_string = "" + "".join(any_string) + ""

which contains exactly len(any_string) + 1 instances of ''.


For "foo" for example, it would be:

"" + "f" + "" + "o" + "" + "o"+ ""
#    |----- from join -------|

As can be seen, there are 4 instances of "" in it.
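
One way to check this decomposition is to list every index at which the empty string "fits". This is only a sketch to illustrate the convention, not how CPython implements str.count; the helper name empty_match_positions is made up for this example.

```python
def empty_match_positions(s):
    """Return every index i at which the empty string 'occurs' in s."""
    # The slice s[i:i] is always '' for 0 <= i <= len(s),
    # so there are len(s) + 1 such positions.
    return [i for i in range(len(s) + 1) if s[i:i] == ""]


positions = empty_match_positions("foo")
print(positions)        # [0, 1, 2, 3]
print(len(positions))   # 4
print("foo".count(""))  # 4
```

The positions are the gap before the "f", the two gaps between letters, and the gap after the last "o".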


Note, however, that this is a question where no answer, or every answer, could make a case for itself. It gets philosophical:

  • How much nothing is contained in nothing?
  • How much nothing is contained in something?

This answer tries to explain the convention used by Python and does not intend to suggest that this is the way all languages do it or should be doing it; it is just how Python does it.


[1] The empty string itself is a special case and is handled differently: ''.count('') simply returns 1, which is yet another convention (it still equals len('') + 1).

Ma0
  • Personally I don't think this logically follows entirely (although others might disagree); read the comments under this answer: https://stackoverflow.com/questions/40192449/why-are-str-count-and-lenstr-giving-different-output/40192499#40192499 I agree with Sven, who says "Infinitely often is just as valid an answer as string length plus one" – Chris_Rands May 08 '18 at 14:34
  • That makes sense. I was just keeping it because Qback suggested a different solution, as far as I understood it, where count('') counts all letters and a possible "ending Null". Edit: never mind that, it got deleted. Alright, I guess that makes sense; I wasn't that far off with my "counting the void" idea then, I guess 8) – Flying Thunder May 08 '18 at 14:36
  • @Chris_Rands You are right; it is a matter of convention. But it is the convention that is explained here; I'll try to make that clear(er). – Ma0 May 08 '18 at 14:37
str.count(sub)

Counts the number of non-overlapping occurrences of sub in str.

Since strings are sequences, it basically counts the number of places at which sub would split str.

An empty string is at the beginning, between each character, and at the end.

Hence, when you use 'test', which has a len of 4, you get 5 occurrences of sub ('').
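
To make those positions visible, here is a small sketch that marks every spot where the empty string matches. The "|" marker and the variable names are arbitrary choices for this illustration; str.replace and re.finditer are used only to show the positions, not because str.count is built on them.

```python
import re

word = "test"

# Replacing '' inserts the marker at the start, between each
# pair of characters, and at the end.
marked = word.replace("", "|")
print(marked)  # |t|e|s|t|

# The regex engine reports the same empty matches by index.
starts = [m.start() for m in re.finditer("", word)]
print(starts)  # [0, 1, 2, 3, 4]

print(word.count(""), len(word) + 1)  # 5 5
```

Five markers for four letters: one more than the length, matching what count('') reports.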

Graipher
Matt_G