
I process a file in three steps: skip the header (a comment line), process the first line, then process the remaining lines.

with open(filename, 'r') as f:
    # skip the header
    next(f)

    # handle the first line
    line = next(f)
    process_first_line(line)

    # handle other lines
    for line in f:
        process_line(line)

If `line = next(f)` is replaced with `line = f.readline()`, the following error is raised:

ValueError: Mixing iteration and read methods would lose data

Therefore, I would like to know: what are the differences among `next(f)`, `f.readline()` and `f.next()` in Python?

SparkAndShine
  • The `next(iter)` function calls `iter.next()`, and will handle the `StopIteration` exception if you give `next()` a second argument. See the dupe for details on `next()` versus `file.readline()`. – Martijn Pieters Nov 25 '15 at 18:14
  • The duplicate covers the same idea, which is the inconsistency resulting from using both `readline` and `next`. However, this post specifically asks what the "difference" is between the two, which differs from what the dup is asking. Someone else asking the same question will likely find this post instead of the dup. In addition, the accepted answer doesn't really answer the question, as it doesn't explain what `readline` does. – orodbhen Jan 30 '18 at 15:52
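
To illustrate the first comment, here is a minimal sketch of how `next()` and `readline()` differ at end of file. It is written for Python 3, where the method behind `next(f)` is `f.__next__()` rather than `f.next()`, and it uses an in-memory `io.StringIO` with made-up contents in place of a real file.

```python
import io

# In-memory text stream standing in for a real file (illustrative data only).
f = io.StringIO("header\nfirst\n")

header = next(f)          # iterator protocol: returns the next line
first = f.readline()      # read method: also returns one line

# Both lines are consumed; the two calls now signal exhaustion differently.
at_eof = f.readline()     # readline() returns an empty string at EOF
default = next(f, None)   # next() with a second argument returns that default

try:
    next(f)               # without a default, StopIteration propagates
except StopIteration:
    raised_stop_iteration = True
else:
    raised_stop_iteration = False

print(repr(header), repr(first), repr(at_eof), default, raised_stop_iteration)
```

Mixing the two calls does not raise here because Python 3 streams share one buffer between iteration and `readline()`; the `ValueError` in the question is specific to Python 2 file objects.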

1 Answer


Quoting the official Python 2 documentation for `file.next()`:

A file object is its own iterator, for example iter(f) returns f (unless f is closed). When a file is used as an iterator, typically in a for loop (for example, for line in f: print line.strip()), the next() method is called repeatedly. This method returns the next input line, or raises StopIteration when EOF is hit when the file is open for reading (behavior is undefined when the file is open for writing). In order to make a for loop the most efficient way of looping over the lines of a file (a very common operation), the next() method uses a hidden read-ahead buffer. As a consequence of using a read-ahead buffer, combining next() with other file methods (like readline()) does not work right.

Basically, when `next()` is called on a Python 2 file object, it reads a chunk of bytes from the file into a buffer and returns only the current line (the end of the line is determined by the newline character). The file pointer has therefore moved further than the end of the returned line. `readline()` does not use that buffer, so calling it afterwards would skip the data already buffered and give an inconsistent result. That is why mixing the two is not allowed.
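
The read-ahead can be observed in a minimal sketch like the one below. It assumes Python 3 on CPython, where `f.buffer` exposes the underlying binary reader; the temporary file and its contents are invented for the illustration. Python 3 does not raise the `ValueError` from the question, so this only shows that more bytes are fetched than the returned line contains.

```python
import os
import tempfile

# Write a small throwaway file (contents are illustrative only).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("header\nfirst\nsecond\n")
    path = tmp.name

try:
    with open(path, "r") as f:
        header = next(f)                 # logically consumes one line
        consumed = len(header.encode())  # bytes in the line that was returned
        fetched = f.buffer.tell()        # bytes already pulled from the binary layer

        # In Python 3, iteration and readline() share the same buffer,
        # so mixing them is safe here, unlike with Python 2 file objects.
        first = f.readline()
        rest = list(f)
finally:
    os.remove(path)

print("bytes in returned line:", consumed)
print("bytes already fetched: ", fetched)
```

For a file this small, the whole content is typically fetched on the first `next(f)` call, which is the same kind of read-ahead the quoted documentation describes.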

thefourtheye
  • While it addresses the error the OP was encountering, this doesn't really answer the question as stated. It only explains how `next()` works, and not the other functions. – orodbhen Jan 30 '18 at 15:54