
Many of Python's built-in functions (any(), all(), and sum(), to name a few) take iterables, so why doesn't len()?

One could always use sum(1 for i in iterable) as an equivalent, but why doesn't len() accept iterables in the first place?

tshepang
iruvar
  • possible duplicate of [Getting number of elements in an iterator in Python](http://stackoverflow.com/questions/3345785/getting-number-of-elements-in-an-iterator-in-python) – jamylak Jul 13 '12 at 01:57
  • Changed title since it was misleading as `len` does support *iterables* just not *iterators* – jamylak Jul 13 '12 at 02:03

2 Answers


Many iterables are defined by generators, which don't have a well-defined len. Take the following generator function, which iterates forever:

def sequence(i=0):
    while True:
        i += 1
        yield i

Basically, to have a well-defined length, you need to know the entire object up front. Contrast that with a function like sum. You don't need to know the entire object at once to sum it -- just take one element at a time and add it to what you've already summed.
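To make the contrast concrete, here is a minimal sketch (Python 3; `first_three` and `numbers` are made-up names for illustration). It shows that len() relies on the object providing `__len__`, which a list does and a generator does not, while sum() is happy to pull elements one at a time:

```python
def first_three():
    """A finite generator: it yields values lazily and defines no __len__."""
    yield 1
    yield 2
    yield 3


numbers = [1, 2, 3]

# A list knows its size up front, so len() works.
list_length = len(numbers)

# A generator object has no __len__, so len() raises TypeError.
generator_has_len = hasattr(first_three(), "__len__")
try:
    len(first_three())
    len_failed = False
except TypeError:
    len_failed = True

# sum() only needs one element at a time, so it accepts the generator.
total = sum(first_three())
```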

Be careful with idioms like sum(1 for i in iterable): if iterable is an iterator, this will exhaust it, so you can't use it anymore. It could also be slow if producing each element involves a lot of computation. It might be worth asking yourself why you need to know the length a priori. This might give you some insight into what type of data structure to use (frequently list and tuple work just fine) -- or you may be able to perform your operation without needing to call len.
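A small sketch of the exhaustion problem (Python 3; `squares` is a hypothetical generator used only for illustration). Counting a generator leaves nothing behind, whereas counting a list leaves the list intact:

```python
def squares(n):
    """Yield the first n square numbers lazily."""
    for k in range(n):
        yield k * k


gen = squares(4)
count = sum(1 for _ in gen)   # walks the generator to its end
leftover = list(gen)          # nothing remains to iterate over

items = [0, 1, 4, 9]
list_count = sum(1 for _ in items)  # a list can be iterated again
still_there = list(items)
```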

mgilson
  • Good point about exhaustion. Even for generators known to yield a finite number of elements, there might be no way to determine how many elements there are short of actually taking them (say you yield [1] if some incredibly slow calculation returns 3 and [1,2] otherwise, or you yield a random number of elements, etc.) – DSM Jul 13 '12 at 01:59

This is an iterable:

def forever():
    while True:
        yield 1

Yet it has no length. If you want to find the length of a finite iterable, the only way to do so, by definition of what an iterable is (something you can repeatedly ask for the next element until you reach the end), is to expand the iterable out fully, e.g.:

len(list(the_iterable))

As mgilson pointed out, you might want to ask yourself - why do you want to know the length of a particular iterable? Feel free to comment and I'll add a specific example.

If you want to keep track of how many elements you have processed, instead of doing:

num_elements = len(the_iterable)
for element in the_iterable:
    ...

do:

num_elements = 0
for element in the_iterable:
    num_elements += 1
    ...

If you want a memory-efficient way of seeing how many elements a comprehension produces, you might wish you could write the following (it raises a TypeError, because a generator expression has no len()):

num_relevant = len(x for x in xrange(100000) if x%14==0)

Building the list first works, but it isn't efficient (you don't need the whole list):

num_relevant = len([x for x in xrange(100000) if x%14==0])

sum would probably be the most handy way, but it looks quite weird and it isn't immediately clear what you're doing:

num_relevant = sum(1 for _ in (x for x in xrange(100000) if x%14==0))

So, you should probably write your own function:

def exhaustive_len(iterable):
    # Counts elements by consuming the iterable one at a time.
    length = 0
    for _ in iterable:
        length += 1
    return length

exhaustive_len(x for x in xrange(100000) if x%14==0)

The long name is to help remind you that it does consume the iterable, for example, this won't work as you might think:

def yield_numbers():
    yield 1; yield 2; yield 3; yield 5; yield 7

the_nums = yield_numbers()
total_nums = exhaustive_len(the_nums)
for num in the_nums:
    print(num)  # never runs: the_nums is already exhausted

because exhaustive_len has already consumed all the elements.
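If you need both the count and the elements, one option is to materialize the generator into a list once, at the cost of holding every element in memory. This is a sketch of that alternative (Python 3), not something the code above does; `seen` is just an illustrative name:

```python
def yield_numbers():
    """Same numbers as in the example above, written as a loop."""
    for n in (1, 2, 3, 5, 7):
        yield n


# Store the elements once; a list supports len() and repeated iteration.
the_nums = list(yield_numbers())
total_nums = len(the_nums)

seen = []
for num in the_nums:
    seen.append(num)  # the loop still sees every element
```

This only makes sense when the data comfortably fits in memory; for large or unbounded streams, counting while you process is the safer choice.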


EDIT: Ah, in that case you would use exhaustive_len(open("file.txt")), as you have to process all lines in the file one by one to see how many there are, and it would be wasteful to store the entire file in memory by calling list.
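A minimal sketch of the file case (Python 3). The helper name `count_lines` and the temporary sample file are assumptions made so the example runs on its own; the `with` block is added here so the file is closed once counting finishes:

```python
import os
import tempfile


def count_lines(path):
    """Count lines by iterating the file object, one line at a time."""
    with open(path) as handle:
        return sum(1 for _ in handle)


# Build a small throwaway file so the sketch is runnable as-is.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\nthird\n")
    sample_path = tmp.name

line_count = count_lines(sample_path)
os.remove(sample_path)
```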

Claudiu
  • Instead of your `num_elements += 1`, you could just do `for num, thing in enumerate(the_iterable): ...`. It depends on what you're doing with it, I suppose. – detly Jul 13 '12 at 02:41
  • Claudiu, thank you for the detailed examples. My specific use case involves calculating the number of records in a large file. len(open("file.txt")) would have been handy. – iruvar Jul 13 '12 at 02:41
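The enumerate suggestion from the comments can be sketched like this (Python 3; `process_all` and the doubling step are placeholders for whatever work you actually do). Starting enumerate at 1 means the loop variable ends up holding the element count, and initializing it to 0 covers the empty case:

```python
def process_all(iterable):
    """Process each element and report how many were seen."""
    processed = []
    count = 0  # stays 0 if the iterable is empty
    for count, element in enumerate(iterable, start=1):
        processed.append(element * 2)  # placeholder for real work
    return count, processed


result = process_all(x for x in range(5))
empty_result = process_all(iter([]))
```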