In Python 2, the built-in function map seems to call __len__ when the iterable defines it. Is that correct? If so, why is map computing the length of the iterable? Iterables don't need to define __len__ (e.g.), and map works even when the iterable does not define a length.
Map is defined here; it does specify that there is length-dependent functionality in the event that multiple iterables are passed. However,
- I'm interested in the case that only one iterable is passed
- Even if multiple iterables were passed (not my question), it seems like an odd design choice to explicitly check the length, instead of just iterating until you run out and then returning None
I am concerned because, according to several (1, 2) extremely highly upvoted questions,
map(f, iterable)
is basically equivalent to:
[f(x) for x in iterable]
But I am running into simple examples where that isn't true.
For Example
class Iterable:
    def __iter__(self):
        # Create a fresh underlying iterator each time iteration starts.
        self.iterable = iter([1, 2, 3, 4, 5])
        return self

    def next(self):  # Python 2 iterator protocol
        return self.iterable.next()

    #def __len__(self):
    #    self.iterable = None
    #    return 5

def foo(x): return x

print( [foo(x) for x in Iterable()] )
print( map(foo,Iterable()) )
This behaves as it should, but if you uncomment the __len__ override, it very much does not.
In that case, it raises an AttributeError because self.iterable is None. While this example is silly, I see no requirement of invariance in the specification of __len__. Surely it's good practice not to modify state in a call to __len__, but the reason should not be unexpected behaviour in builtin functions. In more realistic cases, my __len__ method may just be slow, and I wouldn't expect to have to worry about it being called by map; or maybe it isn't thread safe, etc.
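To make the difference observable without mutating state, here is a minimal sketch that counts how often __len__ is called. CountingIterable and count_len_calls are hypothetical helpers I made up for illustration. On CPython 2 I would expect the map count to be non-zero, but that is exactly the implementation detail in question, so the script only prints what it sees:

```python
class CountingIterable(object):
    """Hypothetical helper: records how many times __len__ is called."""

    def __init__(self, items):
        self.items = list(items)
        self.len_calls = 0

    def __iter__(self):
        return iter(self.items)

    def __len__(self):
        self.len_calls += 1
        return len(self.items)


def foo(x):
    return x


def count_len_calls(build):
    """Run build on a fresh CountingIterable and report the __len__ calls."""
    src = CountingIterable([1, 2, 3, 4, 5])
    # list() forces evaluation on interpreters where map is lazy.
    result = list(build(src))
    return result, src.len_calls


comp_result, comp_calls = count_len_calls(lambda it: [foo(x) for x in it])
map_result, map_calls = count_len_calls(lambda it: map(foo, it))

print("comprehension: %d call(s) to __len__" % comp_calls)
print("map:           %d call(s) to __len__" % map_calls)
```

Both forms produce the same list; only the number of __len__ calls may differ between them, and between interpreters.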
Implementation Dependent?
Since map is a builtin function, it may have implementation-specific features outside the spec, but CPython implements it on line 918 of bltinmodule.c, which indeed states:
/* Do a first pass to obtain iterators for the arguments, and set len
 * to the largest of their lengths. */
It then calls _PyObject_LengthHint, which is defined in Objects/abstract.c, and indeed seems to look for a user-defined __len__. This doesn't clarify to me whether this is just implementation dependent, or whether I'm missing some reason that map purposefully looks for the iterable's length, against my instinct.
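For what it's worth, here is my reading of the hint lookup, sketched with operator.length_hint, which only exists in Python 3.4 and later. I am assuming it behaves like the C helper, which I have not verified against the Python 2 source. WithLen and WithoutLen are hypothetical classes, and the fallback value 8 is just an illustrative default:

```python
import operator


class WithLen(object):
    """Hypothetical iterable that defines __len__."""

    def __iter__(self):
        return iter([1, 2, 3])

    def __len__(self):
        return 3


class WithoutLen(object):
    """Hypothetical iterable with no __len__ and no __length_hint__."""

    def __iter__(self):
        return iter([1, 2, 3])


# length_hint uses len(obj) when available, otherwise the default passed in.
hint_with_len = operator.length_hint(WithLen(), 8)
hint_without_len = operator.length_hint(WithoutLen(), 8)

print(hint_with_len, hint_without_len)
```

If the C helper works the same way, the length is only an estimate with a fallback, which would explain why map still works for iterables that define no length at all.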
(Note: I haven't tested this in Python 3, which is why I specified Python 2. In Python 3, map returns a lazy iterator rather than a list, so at least a few of my claims aren't true there.)