cost of len() and pep8 suggestion on sequence empty check

Question

If the complexity of Python's len() is O(1), why does PEP 8 suggest using if not seq: instead of if len(seq) == 0:?

https://wiki.python.org/moin/TimeComplexity
https://www.python.org/dev/peps/pep-0008/#programming-recommendations

Isn't len(seq) == 0 more readable?

For me, something exists if it has content, so 'if seq' is clear and simple to me :) — prodev_paris, May 07 '15 at 11:48
possible duplicate of [Best way to check if a list is empty](http://stackoverflow.com/questions/53513/best-way-to-check-if-a-list-is-empty) — tommy.carstensen, Jun 03 '15 at 08:28

score 8 · Answer 1 · answered May 07 '15 at 11:48

The former (the truthiness test) handles both an empty string and None. For example, consider these two variables:

>>> s1 = ''
>>> s2 = None

Using the first method

def test(s):
    if s:
        return True
    else:
        return False

>>> test(s1)
False
>>> test(s2)
False

Now using len

def test(s):
    if len(s) == 0:
        return True
    else:
        return False

>>> test(s1)
True
>>> test(s2)
Traceback (most recent call last):
  File "<pyshell#13>", line 1, in <module>
    test(s2)
  File "<pyshell#11>", line 2, in test
    if len(s) == 0:
TypeError: object of type 'NoneType' has no len()

So in terms of performance, both will be O(1), but the truthiness test (the first method) is more robust in that it handles None in addition to empty strings.
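
If None and an empty sequence need different handling, as the comments below discuss, an explicit `is None` check can come before the truthiness test. A minimal sketch, assuming a hypothetical helper named `describe` (not part of the original answer):

```python
def describe(seq):
    """Hypothetical helper: report None, empty, and non-empty separately."""
    if seq is None:
        # Identity check, so a missing value is told apart from an empty one.
        return "missing"
    if not seq:
        # Truthiness test: empty strings, lists, tuples, etc. are falsy.
        return "empty"
    return "has items"


print(describe(None))
print(describe(""))
print(describe([1, 2]))
```

When both cases lead to the same outcome, the single `if not seq:` test is enough.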

Cory Kramer

The explanation is very interesting. Isn't it a better practice to just check if it is not None before checking the `len()`? I don't like to use non-boolean values inside if/while clauses – SomethingSomething May 07 '15 at 11:52
@SomethingSomething Absolutely. In fact the observation in this answer is pretty useless. In real code, when you check whether a sequence is empty, it is because you then want to perform some operation on it (e.g. indexing, appending, removing etc). Those don't work with `None` anyway, rendering this observation useless in 99% of the cases. A remaining 0.99% would be clearer using `seq is/is not None`. – Bakuriu May 07 '15 at 11:54
It depends a bit on your code, but the only reason to have separate checks is if you'll actually treat them separately. So if you do something different when you find None, then the separate test makes sense, but not if None and "" both lead to the same outcome. – SuperBiasedMan May 07 '15 at 11:55
@SomethingSomething but using non-boolean values in conditionals is idiomatic for python. – Peter Wood May 07 '15 at 11:55
@Bakuriu That's not true, there's plenty of cases where you'll want to perform the same action whether you got None or an empty variable. In those cases having two if statements is redundant. – SuperBiasedMan May 07 '15 at 11:56
" I don't like to use non-boolean values inside if/while clauses" : Python objects DO have a boolean value, cf https://docs.python.org/2/reference/datamodel.html#object.__nonzero__ - so `if x:` is just the same as `if bool(x):` (which evals to either `True` or `False`). – bruno desthuilliers May 07 '15 at 11:57
@SuperBiasedMan In my experience that's not the case. An empty collection is the perfect default value for "no values". Only in exceptional cases do you want to distinguish between an empty collection and no collection at all, and in those cases, since you need the distinction, you *have* to treat them differently (otherwise you could simply avoid using `None`). The point is: if you treat them in the same way you can just get rid of `None` and use `''` or `()` directly. If you treat them differently, then the correct check is `is None`. You need the intersection of cases to make this useful. – Bakuriu May 07 '15 at 11:58
@Bakuriu I guess our cases just differ? I've often used it when a function can return an empty value like `[]` or `''` but if there's an issue like a try except block exiting early I end up with a None value instead. For the purposes of checking for a value it doesn't matter if I have empty or None, I don't have the value there. But I do want to preserve the difference so I can later note that the value is None and there was a problem with retrieving it. – SuperBiasedMan May 07 '15 at 12:01

score 5 · Answer 2 · answered May 07 '15 at 11:49

That len is O(1) is only a convention. There may be sequences/collections whose len is not O(1), in which case checking len(x) == 0 could take asymptotically more time.
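
To illustrate, here is a minimal sketch of such a collection, assuming Python 3 (where truthiness uses `__bool__` and falls back to `__len__`). The `LinkedList` class is hypothetical and deliberately keeps no size counter:

```python
class LinkedList:
    """Toy singly linked list that stores no size counter (hypothetical)."""

    def __init__(self, items=()):
        self.head = None
        # Build the chain back to front so node order matches `items`.
        for item in reversed(list(items)):
            self.head = (item, self.head)

    def __len__(self):
        # O(n): walks every node to count them.
        count = 0
        node = self.head
        while node is not None:
            count += 1
            node = node[1]
        return count

    def __bool__(self):
        # O(1): the list is non-empty exactly when a head node exists.
        return self.head is not None


empty = LinkedList()
full = LinkedList(range(1000))

if not empty:
    print("empty list detected without counting nodes")
if full:
    print("non-empty list detected without counting nodes")
```

Here `len(full) == 0` would traverse all 1000 nodes, while `if full:` only inspects the head.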

The fact is: the questions "is the size of this collection equal to k?" and "is the collection empty?" are fundamentally different. The first is answered by len(x) == k and the second by the truthiness of x, i.e. bool(x), which is False for an empty collection. Hence, if you want to check whether the collection is empty, you just use if not collection. The two checks merely happen to coincide when k == 0.

Generally you should use the most specific term to describe exactly what you are doing. Writing len(x) == 0 may just be a typo for len(x) == 10 with a missing 1 while getting if x wrong is harder.

score 0 · Answer 3 · answered May 07 '15 at 12:42

For if len(seq) == 0, you have to parse that

you are calling a function,
you are checking if its return value is equal to something, and
that "something" is 0.

For if seq, you only have to parse that seq is being evaluated in a Boolean context.

You might argue that you also have to know what type seq has, so that you know what it means to evaluate it in a Boolean context, but then you also have to know the type for len(seq), so that you know what it means to compute its length. (For example, for a dict d, len(d) says nothing about the values in the dictionary, only how many keys it has.)
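
A small sketch of that last point, using made-up values: both `len` and truthiness describe the container itself, not what is stored in it.

```python
d = {"a": None, "b": []}   # two keys whose values are themselves "empty-ish"

print(len(d))    # 2: counts keys, regardless of the values
print(bool(d))   # True: the dict itself is non-empty

s = "   "        # whitespace only
print(len(s))    # 3: three characters
print(bool(s))   # True: a non-empty string, even though it looks blank
```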
