I read the example for the built-in function iter in Built-in Functions — Python 3.7.0 documentation:

with open('mydata.txt') as fp:
    for line in iter(fp.readline, ''):
        process_line(line)

I could not figure out what advantage it has over the following:

with open('mydata.txt') as fp:
    for line in fp:
        process_line(line)

Could you please provide any hints?

Chris_Rands
AbstProcDo
  • Actually I think I agree with you, it is useless in this case(?), since an empty string is only produced at EOF, I believe. However, if I'm correct, then this is an odd example to use in the docs – Chris_Rands Sep 21 '18 at 15:07
  • if instead of an empty string, you had `'a\n'` then it would be similar to adding `if line == 'a\n': break` for the 2nd snippet – Chris_Rands Sep 21 '18 at 15:07
  • Why do you suppose there *is* any advantage? It's just giving an example of the two-argument form of `iter` there. – wim Sep 21 '18 at 15:16
  • @wim So they are equivalent right? Isn't it a bad example to put in the docs then as it doesn't demonstrate the utility? Maybe they had in mind something more like this `iter(functools.partial(f.read, 1), '')` – Chris_Rands Sep 21 '18 at 15:20
  • I don't think the example is necessarily bad ("read lines of a file until a certain line is reached"), but i think using an empty string as the "certain line" was not the best choice. – wim Sep 21 '18 at 15:23
  • relevant: https://stackoverflow.com/questions/38087427/what-are-the-uses-of-itercallable-sentinel – Chris_Rands Sep 21 '18 at 21:32
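
The `functools.partial` idea from the comments above can be sketched as follows. This is only an illustration: the `io.StringIO` object and its contents are assumptions standing in for a real file opened in text mode, and the chunk size of 3 is arbitrary.

```python
import io
from functools import partial

# In-memory stand-in for a file opened in text mode (an assumption for
# illustration; a file object returned by open() behaves the same way here).
fp = io.StringIO("abcdefgh")

# read(3) returns up to 3 characters per call and '' once the data is
# exhausted, so '' is a natural sentinel for fixed-size chunked reads.
chunks = list(iter(partial(fp.read, 3), ''))

print(chunks)  # ['abc', 'def', 'gh']
```

Here the two-argument form does something that plain `for line in fp` cannot, because iterating over a file always yields lines rather than fixed-size chunks.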

2 Answers

4

Both versions read the file lazily, one line at a time, without loading the whole file into memory, but the iter() version demonstrates the second argument of iter(), the "sentinel".

From the docs:

if the value returned is equal to sentinel, StopIteration will be raised

So this code will read from the file, until a line equals '' and then stop.

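This is a strange example: readline() returns '' only at the end of the file, because every other line, even a blank one, contains at least a newline character. So the sentinel stops the loop exactly where plain iteration over the file would stop anyway.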
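
A quick way to check this behaviour of readline(), using an in-memory `io.StringIO` as a stand-in for a real file (the sample contents are made up for illustration):

```python
import io

# Contents: the line 'a', a blank line, and 'b' with no trailing newline.
fp = io.StringIO("a\n\nb")

# Call readline() more times than there are lines to see what EOF looks like.
results = [fp.readline() for _ in range(5)]
print(results)  # ['a\n', '\n', 'b', '', '']

# The blank line comes back as '\n', not '', so the '' sentinel
# does not stop the loop early.
fp.seek(0)
lines = list(iter(fp.readline, ''))
print(lines)  # ['a\n', '\n', 'b']
```

The blank line in the middle is returned as '\n', and '' only appears once the data is exhausted.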

Jon Betts
1

As wim and I discussed in the comments, there is no advantage in this particular case. For the second snippet to be equivalent to the first, it would need to look something like this:

with open('mydata.txt') as fp:
    for line in fp:
        if line == '':
            break
        process_line(line)

However, readline can only return an empty string at the end of the file (EOF), so it makes no difference here (every other line contains at least a newline '\n' character).

If a value other than an empty string were used as the sentinel, the difference would be meaningful. Personally, I think the docs should use a better example to illustrate this, such as the following:

>>> f = open('test')
>>> f.read()
'a\nb\nc\n\nd\ne\nf\n\n'
>>> f = open('test')
>>> [line for line in iter(f.readline, 'b\n')]
['a\n']
>>> f = open('test')
>>> [line for line in f]
['a\n', 'b\n', 'c\n', '\n', 'd\n', 'e\n', 'f\n', '\n']

(Note I should really be closing the file handles)
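
For completeness, here is a sketch of the same comparison with the file handles closed properly via `with`. The temporary directory and file are assumptions added so the snippet runs on its own; the file contents match the 'test' file shown above.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    # Throwaway file with the same contents as 'test' in the session above.
    path = os.path.join(tmpdir, "test")
    with open(path, "w") as f:
        f.write("a\nb\nc\n\nd\ne\nf\n\n")

    # Two-argument iter(): stops as soon as readline() returns 'b\n'.
    with open(path) as f:
        until_b = list(iter(f.readline, "b\n"))

    # Plain iteration: yields every line up to the end of the file.
    with open(path) as f:
        all_lines = list(f)

print(until_b)    # ['a\n']
print(all_lines)  # ['a\n', 'b\n', 'c\n', '\n', 'd\n', 'e\n', 'f\n', '\n']
```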

EDIT: I raised this as a possible documentation bug in issue34764

Chris_Rands