line in iter(fp.readline, '') rather than line in fp:

Question

I read the example for the built-in function iter on the Built-in Functions page of the Python 3.7.0 documentation:

with open('mydata.txt') as fp:
    for line in iter(fp.readline, ''):
        process_line(line)

I could not figure out what advantage it has over the following:

with open('mydata.txt') as fp:
    for line in fp:
        process_line(line)

Could you please provide any hints?

Actually I think I agree with you, it is useless in this case(?), since an empty string is only produced at EOF, I believe. However, if I'm correct, then this is an odd example to use in the docs — Chris_Rands, Sep 21 '18 at 15:07
if instead of an empty string, you had `'a\n'` then it would be similar to adding `if line == 'a\n': break` for the 2nd snippet — Chris_Rands, Sep 21 '18 at 15:07
Why do you suppose there *is* any advantage? It's just giving an example of the two-argument form of `iter` there. — wim, Sep 21 '18 at 15:16
@wim So they are equivalent right? Isn't it a bad example to put in the docs then as it doesn't demonstrate the utility? Maybe they had in mind something more like this `iter(functools.partial(f.read, 1), '')` — Chris_Rands, Sep 21 '18 at 15:20
I don't think the example is necessarily bad ("read lines of a file until a certain line is reached"), but I think using an empty string as the "certain line" was not the best choice. — wim, Sep 21 '18 at 15:23
relevant: https://stackoverflow.com/questions/38087427/what-are-the-uses-of-itercallable-sentinel — Chris_Rands, Sep 21 '18 at 21:32

Jon Betts · Answer 1 · 2018-09-21T15:23:20.857

Both iterate over the file lazily, one line at a time, without loading the whole file into memory, but the iter() version demonstrates the second argument of iter(), the "sentinel".

From the docs:

if the value returned is equal to sentinel, StopIteration will be raised

So this code will read from the file, until a line equals '' and then stop.

This is a strange example: every line read from the file is non-empty (a blank line comes back as '\n'), so readline only returns '' at the end of the file anyway.
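A quick way to check this, as a minimal sketch: it uses an in-memory io.StringIO in place of a real file, and the sample text is made up for illustration.

```python
import io

# Hypothetical sample data: one blank line in the middle, no trailing newline.
sample = "first\n\nlast"

# Call readline() four times: three lines, then the end-of-file marker.
fp = io.StringIO(sample)
raw = [fp.readline() for _ in range(4)]
print(raw)  # the blank line is '\n'; only the final call returns ''

# Both loop styles from the question therefore yield the same lines.
with_sentinel = list(iter(io.StringIO(sample).readline, ''))
plain = list(io.StringIO(sample))
print(with_sentinel == plain)
```

The blank line shows up as '\n', so the '' sentinel never matches before the end of the data.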

edited Sep 21 '18 at 15:23

answered Sep 21 '18 at 15:16

You will only get empty string at the end of file. Blank lines would be a '\n' during iteration. – wim Sep 21 '18 at 15:20
Thanks, I've updated the answer to be clearer about that. – Jon Betts Sep 21 '18 at 15:23

Chris_Rands · Answer 2 · 2018-09-21T15:42:40.663

As wim and I discussed in the comments, there is no advantage for this particular case. For the second snippet to be exactly equivalent to the first, it would have to look something like this:

with open('mydata.txt') as fp:
    for line in fp:
        if line == '':
            break
        process_line(line)

However, readline can only return an empty string at the end of the file (EOF), so it makes no difference here (every other line contains at least one character, typically the newline '\n').

If a value other than an empty string were used as the sentinel, the difference would be meaningful. Personally, I think the docs should use a better example to illustrate this, such as the following:

>>> f = open('test')
>>> f.read()
'a\nb\nc\n\nd\ne\nf\n\n'
>>> f = open('test')
>>> [line for line in iter(f.readline, 'b\n')]
['a\n']
>>> f = open('test')
>>> [line for line in f]
['a\n', 'b\n', 'c\n', '\n', 'd\n', 'e\n', 'f\n', '\n']

(Note I should really be closing the file handles)
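For reference, here is a sketch of the same comparison as a script in which with blocks close the file handles. The temporary directory and the file name 'test' inside it are only scaffolding so the example is self-contained.

```python
import os
import tempfile

# Same contents as the 'test' file in the session above.
content = 'a\nb\nc\n\nd\ne\nf\n\n'

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'test')
    with open(path, 'w') as f:
        f.write(content)

    # Two-argument iter(): stop as soon as readline() returns 'b\n'.
    with open(path) as f:
        until_b = list(iter(f.readline, 'b\n'))

    # Plain iteration: every line up to the end of the file.
    with open(path) as f:
        all_lines = list(f)

print(until_b)
print(all_lines)
```

Note that the sentinel line itself is not included in the result: iteration stops before 'b\n' is yielded.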

EDIT: I raised this as a possible documentation bug in issue34764
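The fixed-size read mentioned in the question comments, iter(functools.partial(f.read, 1), ''), is a case where an empty-string sentinel is genuinely useful, because plain iteration over a file yields lines, not chunks. A minimal sketch, assuming an in-memory io.StringIO as a stand-in for a real file and an arbitrary chunk size of 3:

```python
import io
from functools import partial

# Stand-in for an open text file; the contents are made up for illustration.
data = io.StringIO('abcdefg')

# read(3) returns up to 3 characters and '' once the data is exhausted,
# so '' works as the sentinel that ends the iteration.
chunks = list(iter(partial(data.read, 3), ''))
print(chunks)
```

The last chunk is shorter than the others because only one character remains when it is read.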
