14

Is there any case where len(someObj) does not call someObj's __len__ function?

I recently replaced the former with the latter in a (successful) effort to speed up some code. I want to make sure there isn't some edge case where len(someObj) is not the same as someObj.__len__().

wjandrea
David Locke

5 Answers

17

If __len__ returns a length over sys.maxsize, len() will raise an exception. This isn't true of calling __len__ directly. (In fact, __len__ can return any object at all, and nothing will catch that unless the call goes through len().)
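
A minimal sketch of that difference, assuming CPython 3. The classes `Huge` and `NotANumber` and the helper `try_len` are made up for illustration:

```python
import sys


class Huge:
    """Hypothetical class whose reported length exceeds sys.maxsize."""

    def __len__(self):
        return sys.maxsize + 1


class NotANumber:
    """Hypothetical class whose __len__ returns a non-integer."""

    def __len__(self):
        return "three"


def try_len(obj):
    """Return len(obj), or the name of the exception that len() raised."""
    try:
        return len(obj)
    except (OverflowError, TypeError, ValueError) as exc:
        return type(exc).__name__


# Calling the method directly hands back whatever it returns, unchecked.
direct_huge = Huge().__len__()
direct_text = NotANumber().__len__()

# Going through len() validates the result and raises instead.
via_len_huge = try_len(Huge())
via_len_text = try_len(NotANumber())

print(direct_huge == sys.maxsize + 1, direct_text)
print(via_len_huge, via_len_text)
```

Both direct calls succeed, while both len() calls fail, so swapping one spelling for the other can silently hide a bad __len__ implementation.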

Benjamin Peterson
  • 3
    Note that, since `len` is supposed to return the number of elements in a collection, returning something bigger than `sys.maxsize` is almost certainly nonsense. – Mike Graham Mar 20 '10 at 01:20
  • 2
    @Mike In theory you could have an object like Python 3's `range` that doesn't store all its elements in memory and calculates its `__len__` using math. `range.__len__` itself raises an error in that situation: `range(sys.maxsize+1).__len__()` gives `OverflowError: Python int too large to convert to C ssize_t` – wjandrea Jan 05 '20 at 20:25
  • 2
    Note that the above is only true in 2.x. In 3.6, for example, I get `len(range(1000000000000))` -> `1000000000000`, and (worryingly) `range(1000000000000).__len__()` -> `-727379968`. Although this result still shows why you shouldn't call `__len__` yourself! – Karl Knechtel Jul 04 '20 at 18:42
  • `len` also raises an exception if the value returned by `__len__` is negative, or not an `int`. – kaya3 Apr 27 '23 at 17:13
11

What kind of speedup did you see? I can't imagine it was noticeable, was it?

From http://mail.python.org/pipermail/python-list/2002-May/147079.html

in certain situations there is no difference, but using len() is preferred for a couple reasons.

first, it's not recommended to go calling the __methods__ yourself, they are meant to be used by other parts of python.

len() will work on any type of sequence object (lists, tuples, and all). __len__ will only work on class instances with a __len__ method.

len() will return a more appropriate exception on objects without length.

Crescent Fresh
  • It was about half a second on a program that ran for one minute. It's probably because I called len 2,443,519 times. As I was writing the question I realized that I should probably reduce the number of times I'm calling len. – David Locke Jan 30 '09 at 16:16
  • @David: Yeah you missed mentioning the 2,443,519 part. Holy hell ;) – Crescent Fresh Jan 30 '09 at 16:28
  • I personally wouldn't consider the extra 1/120th speedup to make it worth the code ugliness, but that's your call. – Eli Courtwright Jan 30 '09 at 16:50
  • @Eli, Normally I would agree with you. In this case I'm trying to benchmark the same problem in multiple languages. – David Locke Jan 30 '09 at 18:57
  • 1
    FYI: I was able to remove 2,363,276 of those calls to len and that sped things up by another second and a half. – David Locke Feb 04 '09 at 22:01
  • 1
    @DavidLocke: I would think a benchmark based on idiomatic code for each language would be much more useful than one based on bent and mangled code. – Ethan Furman Apr 30 '14 at 19:49
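
For anyone wanting to measure the difference discussed in these comments on their own interpreter, here is a rough sketch using the standard `timeit` module. The list size and the `number` of repetitions are arbitrary assumptions, and the result will vary by Python version and machine:

```python
import timeit

# Hypothetical test object; any sized container would do.
data = list(range(1000))

# Time both spellings on the same object; the repetition count is arbitrary.
via_builtin = timeit.timeit(lambda: len(data), number=200_000)
via_method = timeit.timeit(lambda: data.__len__(), number=200_000)

print(f"len(data):       {via_builtin:.4f} s")
print(f"data.__len__():  {via_method:.4f} s")
```

The lambda wrapper adds the same call overhead to both measurements, so only the gap between the two numbers is meaningful.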
2

I think the answer is that it will always work -- according to the Python docs:

__len__(self):

Called to implement the built-in function len(). Should return the length of the object, an integer >= 0. Also, an object that doesn't define a __nonzero__() method and whose __len__() method returns zero is considered to be false in a Boolean context.
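
The truth-value part of that quote can be sketched as follows. This assumes Python 3, where the method the docs call `__nonzero__()` is named `__bool__()`; the class `Basket` is invented for illustration:

```python
class Basket:
    """Hypothetical container that defines __len__ but not __bool__."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)


empty = Basket([])
full = Basket(["apple"])

# With no __bool__ defined, truthiness falls back to __len__() != 0.
print(bool(empty), bool(full))
```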

Eli Courtwright
Andrew Jaffe
2

There are cases where len(someObj) is not the same as someObj.__len__() since len() validates __len__()'s return value. Here are the possible errors in Python 3.6.9:

  • Too low, i.e. less than 0

    ValueError: __len__() should return >= 0
    
  • Too high, i.e. greater than sys.maxsize (CPython-specific, per the docs)

    OverflowError: cannot fit 'int' into an index-sized integer
    
  • An invalid type, e.g. float

    TypeError: 'float' object cannot be interpreted as an integer
    
  • Missing, e.g. len(object)

    TypeError: object of type 'type' has no len()
    

    I mention this because object.__len__() raises a different exception, AttributeError.

It's also worth noting that range(sys.maxsize+1) is valid, but its __len__() raises an exception:

OverflowError: Python int too large to convert to C ssize_t
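
The negative, wrong-type, and missing cases listed above can be reproduced with a short sketch like this one, assuming CPython 3. The classes and the `outcome` helper are hypothetical names chosen for the example:

```python
def outcome(func, *args):
    """Return func(*args), or the name of the exception it raised."""
    try:
        return func(*args)
    except Exception as exc:
        return type(exc).__name__


class Negative:
    def __len__(self):
        return -1


class Fractional:
    def __len__(self):
        return 1.5


class NoLength:
    pass


results = {
    "negative": outcome(len, Negative()),
    "float": outcome(len, Fractional()),
    "missing via len()": outcome(len, NoLength()),
    "missing via attribute": outcome(getattr, NoLength(), "__len__"),
}
for case, name in results.items():
    print(f"{case}: {name}")
```

Calling `Negative().__len__()` or `Fractional().__len__()` directly would simply return -1 and 1.5, which is the behavior difference the question asks about.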
wjandrea
-4

According to Mark Pilgrim, it looks like the answer is no: len(someObj) is the same as someObj.__len__().

Cheers!

brettkelly