8

I am new to Python (2.6), and have a situation where I need to un-read a line I just read from a file. Here's basically what I am doing.

  for line in file:
     print line
     file.seek(-len(line),1)
     zz = file.readline()
     print zz

However I notice that "zz" and "line" are not the same. Where am I going wrong?

Thanks.

user721975
  • 5
    What benefit do you get if you were to 'un-read' a line? – quamrana May 01 '11 at 17:44
  • @quamrana : It's kind of hard to explain. That's how the code that I am manipulating is written :) – user721975 May 01 '11 at 18:05
  • 3
    Here's an example (my current use): You're reading a block of data from a file, and there's no explicit end-of-block marker, but you can recognize when you've hit the next block. In that case, you don't want the processing of block i to consume the beginning of block i+1, so "un-reading" makes a lot of sense – Eric Anderson Aug 14 '12 at 15:43

4 Answers

12

I don't think `for line in file:` and `seek` make a good combination. Try something like this:

while True:
    line = file.readline()
    if not line:  # readline() returns '' at EOF, so stop here
        break
    print line
    file.seek(-len(line),1)
    zz = file.readline()
    print zz
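
A variant that avoids the length arithmetic is to record the position with `tell()` before reading and seek back to it afterwards. This is a minimal sketch, assuming the file is read only through `readline()` (not a `for` loop); `reread_line` and the in-memory `demo` file are hypothetical names used for illustration.

```python
import io

def reread_line(f):
    """Read one line, rewind to where it started, and read it again."""
    start = f.tell()      # remember where the line begins
    line = f.readline()
    f.seek(start)         # absolute seek back: this "un-reads" the line
    again = f.readline()
    return line, again

demo = io.BytesIO(b"a\nb\nc\n")  # stand-in for a file opened in binary mode
first, again = reread_line(demo)
```

Because the seek is absolute, it does not depend on `len(line)` matching the number of bytes on disk.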
Gustav Larsson
  • I just found it out the hard way. I am coming from a Perl background and am still making adjustments to Python. Thanks for your answer, that worked. – user721975 May 01 '11 at 18:01
  • 3
    `.readline()` returns an empty string `''` on EOF, so the exit condition is `if not line: break`. – jfs May 01 '11 at 18:54
3

You simply cannot mix iterators and seek() this way. You should pick one method and stick to it.
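
The reason, as documented for Python 2 file objects, is that iterating over a file goes through a hidden read-ahead buffer, so the file's real position runs ahead of the line just returned. The sketch below is a toy model of that behavior, not the interpreter's actual implementation; `buffered_lines` and its chunk size are illustrative assumptions.

```python
import io

def buffered_lines(f, chunk_size=8192):
    """Toy read-ahead iterator: pulls chunk_size bytes at a time, yields lines."""
    pending = b""
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            break
        pieces = (pending + chunk).splitlines(True)  # keep line endings
        pending = b""
        # An unterminated last piece may continue in the next chunk.
        if not pieces[-1].endswith(b"\n"):
            pending = pieces.pop()
        for piece in pieces:
            yield piece
    if pending:
        yield pending

f = io.BytesIO(b"a\nb\nc\n")     # in-memory stand-in for a small file
line = next(buffered_lines(f))   # first line, as a for loop would see it
position = f.tell()              # already at the end of the data, not after "a\n"
f.seek(-len(line), 1)            # steps back from the end of the buffered chunk
zz = f.readline()                # so this reads a different line
```

In this model `zz` ends up as the last line rather than the first, which mirrors the mismatch described in the question.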

David Heffernan
2

You can combine the iteration over lines with a .seek() operation:

import os

for i, line in enumerate(iter(f.readline, ''), 1):
    print i, line,
    if i == 2:  # read 2nd line two times
        f.seek(-len(line), os.SEEK_CUR)

If a file contains:

a
b
c

Then the output would be:

1 a
2 b
3 b
4 c
jfs
2

Untested. Basically, you want to maintain a LIFO cache of 'unread' lines. On each read, if there is something in the cache, take it out of the cache first; if the cache is empty, read a new line from the file. This is rough, but should get you going.

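lineCache = []

def pushLine(line):
    # Put a line back so that the next iteration returns it again.
    lineCache.append(line)

def nextLine(f):
    while True:
        # Serve pushed-back lines first, most recently pushed first (LIFO).
        if lineCache:
            yield lineCache.pop()
            continue
        line = f.readline()
        if not line:  # readline() returns '' at EOF
            break
        yield line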

f = open('myfile')

for line in nextLine(f):
    # If we need to 'unread' the line, call pushLine on it. The next line yielded by
    # nextLine will be that same 'unread' line.
    if some_condition_that_warrants_unreading_a_line:
        pushLine(line)
        continue
    # handle line that was read.
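
The same idea can be kept out of module-level state by wrapping the iterator in a small class. The sketch below is one possible shape for it, applied to the "no end-of-block marker" case mentioned in the comments on the question; `PushbackLines`, `read_blocks`, and `is_header` are hypothetical names, and the sample data is made up for illustration.

```python
class PushbackLines(object):
    """Iterate over lines, letting the caller push lines back (LIFO)."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._stack = []

    def __iter__(self):
        return self

    def __next__(self):
        # Pushed-back lines are served before anything new is read.
        if self._stack:
            return self._stack.pop()
        return next(self._it)

    next = __next__  # Python 2 spelling of the iterator protocol

    def push(self, line):
        self._stack.append(line)


def read_blocks(lines, is_header):
    """Group lines into blocks; a line for which is_header() is true starts a block."""
    source = PushbackLines(lines)
    blocks = []
    for line in source:
        block = [line]
        for following in source:
            if is_header(following):
                source.push(following)  # belongs to the next block: un-read it
                break
            block.append(following)
        blocks.append(block)
    return blocks


sample = ["[a]\n", "1\n", "2\n", "[b]\n", "3\n"]
blocks = read_blocks(sample, lambda l: l.startswith("["))
```

Since the wrapper only relies on iteration, it works on any iterable of lines, including an open file, without calling `seek()` at all.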
Michael Kent
  • http://code.activestate.com/recipes/502304-iterator-wrapper-allowing-pushback-and-nonzero-tes/ – tzot May 02 '11 at 00:06