In Python, what is the purpose of having iterables and iterators as two separate kinds of object?

Question

I know that passing an iterable as an argument to iter() returns an iterator. So why can't an iterable always be an iterator? What is the purpose of having an iterable object if it doesn't have a __next__ method?

score 2 · Answer 1 · answered Jun 21 '20 at 17:40

Think of being iterable as a special talent of an object: it can be iterated over, e.g. in a for loop or when unpacking.

An iterator is an object that is responsible for delivering data from something, one item at a time. This means you can have several of these objects, each independently delivering data from the same underlying object.
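As a minimal sketch of that point (the variable names here are only illustrative), two iterators created from the same list each keep their own position:

```python
letters = ["a", "b", "c"]

# Each call to iter() on the list returns a new, independent iterator.
first = iter(letters)
second = iter(letters)

# Advancing one iterator does not move the other.
first_items = [next(first), next(first)]
second_items = [next(second)]

print(first_items)   # ['a', 'b']
print(second_items)  # ['a']
```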

score 1 · Answer 2 · answered Jun 21 '20 at 17:39

You can only iterate over an iterator once. Think of an iterator as something like a function that returns objects one by one: you can only cycle through it one time, and only in its preset order.

Iterables are objects that you can iterate over. Iterables that are not themselves iterators, such as lists and tuples, are unaffected by iteration and can usually be accessed in other ways. It's possible to index into a sequence such as a list, but not into an iterator (and not into every iterable: a set, for example, cannot be indexed). This means that I can access the tenth, seventh, or last element of a list without needing any other elements, but I need to cycle through the preceding elements of an iterator to get to those elements.
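A small sketch of the difference, using a list as the iterable (the names are illustrative, not from the question):

```python
numbers = [10, 20, 30]
numbers_iter = iter(numbers)

# The iterator can be consumed only once.
first_pass = list(numbers_iter)
second_pass = list(numbers_iter)  # already exhausted, so this is empty

# The list itself is unaffected and can be iterated again or indexed.
list_again = list(numbers)
last_item = numbers[-1]

# A list iterator does not support indexing.
try:
    numbers_iter[0]
except TypeError:
    iterator_indexable = False
else:
    iterator_indexable = True

print(first_pass, second_pass, list_again, last_item, iterator_indexable)
```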

A more in-depth explanation can be found at this answer to a similar question.

score 0 · Answer 3 · answered Jun 21 '20 at 18:06

Classes decide how they are going to be iterated based on what is returned from the __iter__ method. Sometimes iterables are their own iterator (e.g., a file object) and sometimes iterables create separate iterator objects (e.g., a list). It's up to the developer to decide which implementation is best.
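One way to check which approach an object takes is to compare it with what iter() returns. The sketch below uses io.StringIO as an in-memory stand-in for a real file object, which is an assumption made only to keep the example self-contained:

```python
import io

stream = io.StringIO("first\nsecond\n")  # stand-in for an open file
items = ["first", "second"]

# A file-like object returns itself from __iter__ ...
stream_is_own_iterator = iter(stream) is stream

# ... while a list hands out a separate iterator object.
list_is_own_iterator = iter(items) is items

print(stream_is_own_iterator)  # True
print(list_is_own_iterator)    # False
```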

In the case of a file object, it only has a single current position and reads will always continue at that point. It doesn't make sense to have unique iterators that would continually have to swap file position to read properly. The same goes for streaming protocols that can't rewind at all.

Generators are like file objects and streams. They can't change position so they can be their own iterator.
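A short sketch of that behavior, with a hypothetical generator function count_up_to written just for this illustration:

```python
def count_up_to(limit):
    """Yield 1, 2, ..., limit."""
    n = 1
    while n <= limit:
        yield n
        n += 1

gen = count_up_to(3)

# A generator object is its own iterator.
gen_is_own_iterator = iter(gen) is gen

# It cannot rewind: once consumed, a second pass yields nothing.
gen_first_pass = list(gen)
gen_second_pass = list(gen)

print(gen_is_own_iterator, gen_first_pass, gen_second_pass)
```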

For a list object though, it would be strange if only one code entity could iterate through it at a time. list objects return a separate iterator that tracks current position in the list for that one iterator only.

The difference between these two approaches to iteration can break code, or at least make it less usable. Consider a file processor that works with multiline records. It could use an inner for to continue iterating lines of the file.

def file_processor(f):
    for line in f:
        if line.startswith('newrecord'):
            # Continue reading from the current position of f.
            for line in f:
                print(line.strip())
                if line.startswith('endrecord'):
                    break

But this breaks if you pass in a list, because that inner for will start at the top of the list again. You could change it to work with more objects by having it explicitly get an iterator:

def file_processor(f):
    # Get one iterator up front so both loops share the same position.
    iter_f = iter(f)
    for line in iter_f:
        if line.startswith('newrecord'):
            for line in iter_f:
                print(line.strip())
                if line.startswith('endrecord'):
                    break
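To compare the two versions side by side, here is a sketch with hypothetical variants, collect_naive and collect_with_iter, that return the record lines instead of printing them. The sample text and the use of io.StringIO in place of a real file are assumptions made for the illustration:

```python
import io

def collect_naive(lines):
    """Like the first version, but returns lines instead of printing."""
    out = []
    for line in lines:
        if line.startswith("newrecord"):
            for line in lines:  # restarts from the top if lines is a list
                out.append(line.strip())
                if line.startswith("endrecord"):
                    break
    return out

def collect_with_iter(lines):
    """Like the second version: both loops share one iterator."""
    it = iter(lines)
    out = []
    for line in it:
        if line.startswith("newrecord"):
            for line in it:
                out.append(line.strip())
                if line.startswith("endrecord"):
                    break
    return out

text = "header\nnewrecord\nalpha\nendrecord\ntrailer\n"
as_list = text.splitlines(keepends=True)

naive_on_stream = collect_naive(io.StringIO(text))
naive_on_list = collect_naive(as_list)
fixed_on_stream = collect_with_iter(io.StringIO(text))
fixed_on_list = collect_with_iter(as_list)

print(naive_on_stream)  # ['alpha', 'endrecord']
print(naive_on_list)    # ['header', 'newrecord', 'alpha', 'endrecord']
print(fixed_on_stream)  # ['alpha', 'endrecord']
print(fixed_on_list)    # ['alpha', 'endrecord']
```

With the explicit iter() call, the list and the file-like object give the same result.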

score 0 · Answer 4 · answered Jun 21 '20 at 18:49

As an example of an iterable which is not itself an iterator, let's take a list. An iterator over a list needs to contain state, namely the index number of the next item to be fetched. A list itself does not contain this state. But let's look at an example where we have a list, and generate an iterator from it and use that in place of the list, in order to demonstrate how otherwise working code would break if a list was itself an iterator.

The key issue is that we are looping over the list more than once. In this example, the loops are nested, but similar problems would occur if the loops were encountered sequentially.

names = ["Brontolo", "Cucciolo", "Dotto", "Eolo",
         "Gongolo", "Mammolo", "Pisolo"]  # This is not an iterator...

names = iter(names)  # ... but let's simulate what would happen if it was.

for name1 in names:
    for name2 in names:
        if name1 == name2:
            print(f"{name1} looks in the mirror")
        else:
            print(f"{name1} looks at {name2}")

Output:

Brontolo looks at Cucciolo
Brontolo looks at Dotto
Brontolo looks at Eolo
Brontolo looks at Gongolo
Brontolo looks at Mammolo
Brontolo looks at Pisolo

This does not work properly at all, because the two loops are sharing the same iterator. On the first iteration of the outer name1 loop, the index is incremented. Then the inner name2 loop skips the first item and loops from the second item to the last. Then, on the next attempted iteration of the outer loop, the index is already pointing at the end of the list, and the loop terminates.

Now comment out the names = iter(names) statement, and of course it works as intended. What happens this time is that because a list does not have a __next__ method, when a statement like for name1 in names: is encountered, a new iterator is generated on the fly to yield the values of name1, and it is this iterator that contains the index, rather than the list itself. On each iteration of the outer loop, an entirely separate iterator object is similarly generated for the inner loop, which can then be iterated over independently.
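The same separation can be sketched for a user-defined class. Countdown and CountdownIterator below are hypothetical names invented for this example: the iterable holds the data, and each call to __iter__ creates a fresh iterator that holds the position.

```python
class Countdown:
    """Iterable: stores the starting value but no iteration position."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # A new iterator for every loop, so loops do not interfere.
        return CountdownIterator(self.start)


class CountdownIterator:
    """Iterator: stores the current position and implements __next__."""

    def __init__(self, current):
        self.current = current

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


countdown = Countdown(2)

# Nested loops over the same iterable work because each gets its own iterator.
pairs = [(a, b) for a in countdown for b in countdown]
print(pairs)  # [(2, 2), (2, 1), (1, 2), (1, 1)]
```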
