25

I'd like to be able to overwrite some bytes at a given offset in a file using Python.

My attempts have failed miserably and resulted in:

  • overwriting the bytes at the offset but also truncating the file just after (file mode = "w" or "w+")
  • appending the bytes at the end of the file (file mode = "a" or "a+")

Is it possible to achieve this with Python in a portable way?

Foggzie
sebastien
  • Not really, the one you linked is about *inserting* data, whereas mine is about *replacing* existing data in place (without rewriting the whole file content). – sebastien Feb 04 '09 at 21:45
  • Using the mmap module is one solution for you. Read this: http://stackoverflow.com/questions/125703/how-do-i-modify-a-text-file-in-python – truease.com Mar 05 '12 at 19:28
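
The mmap suggestion in the comments can be sketched roughly as follows. This is only an illustration under assumptions: the temporary file, the variable names, and the sample contents are made up for the demo. It maps the whole file into memory and replaces a slice in place; the replacement has to be exactly as long as the slice it overwrites, so the file size never changes.

```python
import mmap
import os
import tempfile

# Create a throwaway demo file (hypothetical contents).
fd, mmap_path = tempfile.mkstemp()
os.close(fd)
with open(mmap_path, "wb") as fh:
    fh.write(b"123456789")

offset, data = 3, b"xxx"

with open(mmap_path, "r+b") as fh:
    # Length 0 maps the entire file; this fails on an empty file.
    with mmap.mmap(fh.fileno(), 0) as mm:
        # Slice assignment overwrites in place and must keep the same length.
        mm[offset:offset + len(data)] = data
        mm.flush()

with open(mmap_path, "rb") as fh:
    mmap_result = fh.read()
os.remove(mmap_path)

print(mmap_result)  # b'123xxx789'
```

For a single small overwrite, a plain seek and write is simpler; mmap tends to pay off when many scattered offsets in a large file need patching.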

3 Answers

47

Try this:

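# "r+b" opens the existing file for reading and writing in binary mode, without truncating it
with open("filename.ext", "r+b") as fh:
    fh.seek(offset)  # move to the byte offset to overwrite
    fh.write(data)   # data is a bytes object; it replaces the existing bytes in place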
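
A self-contained sketch of the same approach, in case a runnable version helps. The helper name overwrite_at, the temporary file, and its contents are assumptions made up for this demo and are not part of the original answer.

```python
import os
import tempfile

def overwrite_at(path, offset, data):
    """Overwrite len(data) bytes of the file at `path`, starting at `offset`."""
    # "r+b": read/write, binary, no truncation. The file must already exist.
    with open(path, "r+b") as fh:
        fh.seek(offset)
        fh.write(data)

# Demo on a throwaway file (hypothetical contents).
fd, demo_path = tempfile.mkstemp()
os.close(fd)
with open(demo_path, "wb") as fh:
    fh.write(b"123456789")

overwrite_at(demo_path, 3, b"xxx")

with open(demo_path, "rb") as fh:
    result = fh.read()
os.remove(demo_path)

print(result)  # b'123xxx789'
```

The file length stays the same as long as the write ends before the end of the file. If the write runs past the end, the file simply grows.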
tzot
Ben Blank
  • I confirm that this seems to work (but not necessarily with other file modes than r+) – Kena Feb 03 '09 at 21:48
  • 1
    @Kena — The "r+" mode specifically means to open the file for (reading and) writing, leave the "pointer" at the beginning of the file, and do not truncate. The "a+" mode should also work for this, as we use seek anyway, but other modes won't. – Ben Blank Feb 03 '09 at 22:35
  • 4
    @Ben Blank: "r+" (better, "r+b") is the answer to this. "a+" would NOT work for this. Whatever the seek, a file opened with "a" or "a+" appends any writes at its end. – tzot Feb 04 '09 at 02:59
  • @ΤΖΩΤΖΙΟΥ — *checks my notes* D'oh. Right you are. :-) – Ben Blank Feb 04 '09 at 04:53
5

According to this Python page, you can call file.seek to seek to a particular offset. You can then write whatever you want.

To avoid truncating the file, you can open it with "a+" then seek to the right offset.

tomjen
  • 3
    No, the answer is opening with "r+b" (binary since we want to overwrite bytes). A "man 3 fopen", section DESCRIPTION should explain the difference among the available modes. – tzot Feb 04 '09 at 03:02
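
The point about append mode can be checked with a small experiment. This is a sketch under assumptions: the temporary file and its contents are made up, and the behavior shown is what POSIX systems such as Linux do, where append mode sends every write to the end of the file regardless of the seek position. Other platforms may differ.

```python
import os
import tempfile

# Create a throwaway demo file (hypothetical contents).
fd, append_path = tempfile.mkstemp()
os.close(fd)
with open(append_path, "wb") as fh:
    fh.write(b"123456789")

with open(append_path, "a+b") as fh:
    fh.seek(3)        # the seek affects reads, but not where writes land
    fh.write(b"xxx")  # in append mode this goes to the end of the file

with open(append_path, "rb") as fh:
    appended = fh.read()
os.remove(append_path)

print(appended)  # on Linux: b'123456789xxx'
```

The original nine bytes are untouched and the new bytes end up at the end, which is why "r+b" is the mode to use for in-place overwrites.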
0

Very inefficient, but I don't know of any other way right now that doesn't overwrite the bytes in the middle (as Ben Blank's answer does):

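# open() replaces the Python 2-only file() builtin
with open('/tmp/test123', 'r+') as a:
    s = a.read()  # read the whole file into memory
    a.seek(0)     # go back to the start before rewriting
    a.write(s[:3] + 'xxx' + s[3:])  # rewrite everything with 'xxx' inserted at offset 3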

will write 'xxx' at offset 3: 123456789 --> 123xxx456789

Johannes Weiss
  • 1
    Since the OP asked how to overwrite bytes, I think that overwriting the bytes is not actually a problem. – John Fouhy Feb 03 '09 at 22:31
  • Sure? quote: My attempts have failed miserably and resulted [...] either in overwriting the bytes at given offset [...] – Johannes Weiss Feb 04 '09 at 10:19
  • 2
    @Johannes Weiß — You cut that quote off right before the good part. He's lamenting the truncation, not the overwrite. – Ben Blank Feb 04 '09 at 15:33
  • Files are contiguous on disk so you can't insert into the middle of a file without shifting the remainder of the file. Yes, this is inefficient. Your implementation can be made more efficient though - you're reading the entire file into memory and then you're creating another string in memory by concatenation that's also the size of the file before writing it to disk. This would be a problem for large files. You should loop through the file in chunks, and avoid the concatenation by writing the portions separately. You would insert your block when you reach the correct offset during the loop. – steveayre Mar 23 '18 at 14:20
  • That method assumes you're writing to a different filename. To write to the same file the algorithm is a little more complex - you'd also need a buffer of data you've overwritten that you can write back of the same size as the block you're inserting. – steveayre Mar 23 '18 at 14:30
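
The chunked approach described in these comments, writing to a different filename, might look something like the sketch below. The function name insert_bytes, the chunk size, and the demo file names are all assumptions made up for illustration. It covers only the simpler different-file case, not the in-place variant.

```python
import os
import tempfile

def insert_bytes(src_path, dst_path, offset, data, chunk_size=64 * 1024):
    """Copy src_path to dst_path, inserting `data` at byte `offset`."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        remaining = offset
        # Copy the first `offset` bytes in chunks.
        while remaining > 0:
            chunk = src.read(min(chunk_size, remaining))
            if not chunk:  # source is shorter than `offset`
                break
            dst.write(chunk)
            remaining -= len(chunk)
        # Write the inserted block at the insertion point.
        dst.write(data)
        # Copy the rest of the source in chunks.
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)

# Demo with throwaway files and a tiny chunk size to exercise the loops.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "in.bin")
dst_path = os.path.join(workdir, "out.bin")
with open(src_path, "wb") as fh:
    fh.write(b"123456789")

insert_bytes(src_path, dst_path, 3, b"xxx", chunk_size=4)

with open(dst_path, "rb") as fh:
    inserted = fh.read()
print(inserted)  # b'123xxx456789'
```

Memory use is bounded by chunk_size rather than by the file size. Once the copy is complete, os.replace(dst_path, src_path) can swap the new file in for the original.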