
This routine looks OK to me but ends up writing rubbish to the file. lines_of_interest is a set of lines (896227L, 425200L, 640221L, etc.) that need to be changed in the file. The if/else determines what is changed on each of those lines. This is the first time I have used seek(), but I believe the syntax is correct. Can anyone spot the errors that stop the code from working correctly?

outfile = open(OversightFile, 'r+')
for lines in lines_of_interest:
        for change_this in outfile:
            line = change_this.decode('utf8', 'replace')
            outfile.seek(lines)
            if replacevalue in line:
                line = line.replace(replacevalue, addValue)
                outfile.write(line.encode('utf8', 'replace'))
                break#Only check 1 line
            elif not addValue in line:
                #line.extend(('_w\t1\t'))
                line = line.replace("\t\n", addValue+"\n")
                outfile.write(line.encode('utf8', 'replace'))
                break#Only check 1 line
outfile.close()
tshepang
jhmiller
  • I presume it's ``for line in lines_of_interest:``, not ``for lines in lines_of_interest:``, am I right? – eyquem Aug 13 '13 at 19:19
  • Your code seems to me to be full of oddities. First, ``(896227L, 425200L, 640221L, etc)`` isn't a set, it's a tuple. Second, do you mean that each element of this tuple is the "order" number of a line? If so, you can't write ``seek(line)`` where line is an "order" number; the first argument of seek() must be a number of **characters** – eyquem Aug 13 '13 at 19:29
  • I am no Python expert; I read forums and examples and **try** to learn. Whenever I get stuck I ask here, as you guys are the experts. All the examples I have seen use "search_offset = infile.tell() - len(line) - 1" to get the current position. Is there a better method to get the position? – jhmiller Aug 13 '13 at 22:08

2 Answers


You should think of files as unchangeable (unless you only want to append to them). If you want to change existing lines in a file, here are the steps:

  1. Read each line from your input file, e.g. data.txt
  2. Write every line including the changed lines to an output file, e.g. new_file.txt
  3. Delete the input file.
  4. Rename the output file to the input file name.

One problem you don't want to have to deal with in step 2) is trying to conjure up a filename that doesn't already exist. The tempfile module will do that for you.
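The four steps above can be sketched roughly as follows. This is only an illustration, not part of the original answer: it assumes Python 3.3+ (for `os.replace`), a text file in the platform's default encoding, and a hypothetical helper `rewrite_lines` whose `transform` callback receives the 1-based line number and the line text.

```python
import os
import tempfile


def rewrite_lines(path, transform):
    """Rewrite `path`, passing every line through transform(number, line)."""
    directory = os.path.dirname(os.path.abspath(path))
    # Let tempfile pick an unused name, in the same directory as the
    # original so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, text=True)
    try:
        with os.fdopen(fd, "w") as dst, open(path) as src:
            # Steps 1 and 2: read each line, write it (changed or not).
            for number, line in enumerate(src, start=1):
                dst.write(transform(number, line))
        # Steps 3 and 4: replace the input file with the new file.
        os.replace(tmp_path, path)
    except Exception:
        # Do not leave a stray temporary file behind on failure.
        os.remove(tmp_path)
        raise
```

A call such as `rewrite_lines('data.txt', lambda n, line: line.upper() if n == 2 else line)` would change only the second line and copy the rest unchanged.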

The fileinput module can be used to do all those steps transparently:

#1.py
import fileinput as fi

f = fi.FileInput('data.txt', inplace=True)

for line in f:
    print "***" + line.rstrip()

f.close()

--output:--
$ cat data.txt
abc
def
ghi
$ python 1.py 
$ cat data.txt
***abc
***def
***ghi

The fileinput module opens the filename you give it and renames the file. Print statements are then directed into an empty file created with the original name. When you are done, the renamed file is deleted (or you can specify that it should remain).
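For readers on Python 3, the same idea might look like the sketch below. It is not from the original answer: the helper name `prefix_lines` is hypothetical, and passing `backup='.bak'` is one way to ask fileinput to keep the renamed original rather than delete it.

```python
import fileinput
import sys


def prefix_lines(path, prefix, backup_ext=".bak"):
    """Prepend `prefix` to every line of `path`, editing it in place."""
    # With inplace=True, fileinput redirects sys.stdout into the new file.
    # Giving a backup extension keeps the original as path + backup_ext.
    with fileinput.input(files=[path], inplace=True, backup=backup_ext) as f:
        for line in f:
            # `line` still ends with its newline, so write it unchanged.
            sys.stdout.write(prefix + line)
```

After `prefix_lines('data.txt', '***')`, the edited text is in data.txt and the untouched original should remain in data.txt.bak.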

7stud

You are both looping over the file and seeking in it, multiple times, but never reset the position before reading again.

In the first iteration, you read the first line, then you seek elsewhere into the file, write to that position, then break out of the for change_this in outfile: loop.

The next iteration of the for lines in lines_of_interest: loop then starts reading from outfile again, but the file is now positioned at the point where the last outfile.write() left off. That means you are now reading whatever followed the data you have just written.

This is probably not what you wanted to do.

If you wanted to read the line from the same position, and write it back to the same location, you need to seek first and use .readline() instead of iteration to read your line. Then seek again before writing:

outfile = open(OversightFile, 'r+')

for position in lines_of_interest:
    # Jump to the offset and read the line that starts there
    outfile.seek(position)
    line = outfile.readline().decode('utf8', 'replace')
    # Seek back so the write lands where the line began
    outfile.seek(position)
    if replacevalue in line:
        line = line.replace(replacevalue, addValue)
        outfile.write(line.encode('utf8'))
    elif addValue not in line:
        line = line.replace("\t\n", addValue+"\n")
        outfile.write(line.encode('utf8'))

outfile.close()

Note, however, that if you write out data that is shorter or longer than the original line, the file size will not adjust! Writing a longer line will overwrite the first characters of the next line; a shorter write will leave the trailing characters of the old line in the file.
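A small experiment can make that warning concrete. The sketch below is an illustration only (Python 3, with hypothetical helper names `overwrite_at` and `demo`): it writes a three-line file, overwrites the second line in place with a longer and then a shorter replacement, and returns the resulting bytes.

```python
import os
import tempfile

ORIGINAL = b"abc\ndef\nghi\n"


def overwrite_at(path, offset, data):
    """Write `data` at byte `offset`; the rest of the file is not shifted."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)


def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        results = []
        # The second line, "def\n", starts at byte offset 4.
        for replacement in (b"defXYZ\n", b"d\n"):
            with open(path, "wb") as f:
                f.write(ORIGINAL)
            overwrite_at(path, 4, replacement)
            with open(path, "rb") as f:
                results.append(f.read())
        return results
    finally:
        os.remove(path)


longer, shorter = demo()
# longer  == b"abc\ndefXYZ\n\n"   : "ghi" has been overwritten
# shorter == b"abc\nd\nf\nghi\n"  : a leftover "f" from the old line remains
```

In both cases the file is still 12 bytes long, which is why an in-place write only works cleanly when the replacement has exactly the same length as the original.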

Martijn Pieters
  • The first for gets the line to edit, the next for gets the line from the file. I hope the seek() then moves to the line I want to change. Then the if/else makes the changes to that line. The breaks are used to stop reading lines in outfile and return to the first for to get the next line to edit, and so on. That's how I wanted it to work. – jhmiller Aug 13 '13 at 17:59
  • @user1620852: No, the next `for` gets the **first** line in the file. If you wanted to read a line at a specific position, you'd have to seek to that line and use `.readline()` instead. – Martijn Pieters Aug 13 '13 at 18:00
  • I thought that the second for would start from the beginning but I also thought that the seek() moved to the line. – jhmiller Aug 13 '13 at 18:08
  • @user1620852: but you don't seek until *after* the `for` loop has already started reading. – Martijn Pieters Aug 13 '13 at 18:10
  • Nine times out of ten the line will need text added to it, making it longer, so is this method a non-starter? Is it possible to delete the old line and add a new line at the end of the file, where the elif applies? – jhmiller Aug 13 '13 at 18:23
  • @user1620852 I believe you shouldn't think of "changing lines". 99.9% of the time you will have to re-write the whole file, so you are making the code much more complex for nothing. The only situation in which you can `tell()`/`seek()` to modify specific "lines" is when your file is made of fixed-length data blocks. – Bakuriu Aug 13 '13 at 18:32
  • You cannot 'delete' a line either; your best bet is to just *rewrite* the file altogether. – Martijn Pieters Aug 13 '13 at 18:43
  • See [How to write to a specific line of a file?](http://stackoverflow.com/q/18164460) – Martijn Pieters Aug 13 '13 at 18:44
  • So the best way is to scan each line to a new file and add replaceValue for lines that do not have it and then process the new file, this way I can use the above method? – jhmiller Aug 13 '13 at 19:04
  • I'd use the `fileinput` module to handle the details but essentially you need to write all lines to a new file, fix lines along the way, then replace the old file with the new. See http://stackoverflow.com/a/16485997, http://stackoverflow.com/a/16701843, http://stackoverflow.com/a/16763473 and http://stackoverflow.com/a/15657512 for examples. – Martijn Pieters Aug 13 '13 at 19:28
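The fixed-length case mentioned in the comments is the one situation where seeking and overwriting is safe. The sketch below is only an illustration under assumed conditions: Python 3, a hypothetical layout where every record is exactly `RECORD_SIZE` bytes including its newline, and a hypothetical helper `replace_record`.

```python
RECORD_SIZE = 8  # assumption: 7 bytes of data plus "\n" per record


def replace_record(path, index, text):
    """Overwrite record number `index` (0-based) without moving other data."""
    data = text.encode("utf8")
    if len(data) > RECORD_SIZE - 1:
        raise ValueError("text does not fit in one record")
    # Pad with spaces so the record keeps its exact length.
    record = data.ljust(RECORD_SIZE - 1) + b"\n"
    with open(path, "r+b") as f:
        # Every record has the same size, so its offset is a simple product.
        f.seek(index * RECORD_SIZE)
        f.write(record)
```

Because each write is exactly `RECORD_SIZE` bytes, neighbouring records are never touched. Lines of varying length do not offer that guarantee, which is why the answers above recommend rewriting the whole file.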