I have a .txt file whose contents are:

This is an example file.
These are its contents.
This is line 3.

If I open the file, move to the beginning, and write some text like so...

f = open(r'C:\Users\piano\Documents\sample.txt', 'r+')
f.seek(0, 0)
f.write('Now I am adding text.\n')

What I am expecting is for the file to read:

Now I am adding text.
This is an example file.
These are its contents.
This is line 3.

...but instead it reads:

Now I am adding text.
.
These are its contents.
This is line 3.

So why is some of the text being replaced instead of the text I'm writing simply being added onto the beginning? How can I fix this?

L Martin
  • Possible duplicate of [Prepend line to beginning of a file](https://stackoverflow.com/questions/5914627/prepend-line-to-beginning-of-a-file) – Sheldore Dec 24 '18 at 19:27
  • Despite what you may have learned from text editors, there is no "insert" mode for file writing. – user2357112 Dec 24 '18 at 19:33

2 Answers

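write() overwrites whatever is already at the current position in the file; it never inserts.
To get around this, read the existing contents first and then write everything back: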

with open(r'C:\Users\piano\Documents\sample.txt', 'r+') as file:
    string = file.read()  # keep the existing contents
    file.truncate(0)  # delete all contents
    file.seek(0, 0)  # go back to the start of the now-empty file
    file.write('Now I am adding text.\n' + string)

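Using with is also recommended: the file object's __exit__() method calls close() automatically when the block ends, even if an exception is raised. This matters because only CPython's reference counting closes an unreferenced file promptly, and not all Python interpreters are CPython.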

Bonus: If you wish to insert a line in between existing lines, you can do:

with open(r'C:\Users\piano\Documents\sample.txt', 'r+') as file:
    contents = file.readlines()
    # Index 1 makes the new text the second line
    contents.insert(1, 'Now I am adding text.\n')
    file.truncate(0)  # delete all contents
    file.seek(0, 0)
    file.writelines(contents)
ycx

Most file systems don't work like that. A file's contents are mapped to data blocks, and these data blocks are not guaranteed to be contiguous on the underlying system (i.e. not necessarily "side-by-side").

When you seek, you're seeking to a byte offset, and a write at that offset replaces the bytes already there. So if you want to insert new data between two byte offsets of a particular block, you have to shift all subsequent data over by the length of what you're inserting. Since the block could easily be entirely "filled", shifting the bytes over might require allocating a new block. If the subsequent block is entirely "filled" as well, you have to shift the data of that block too, and so on. You can start to see why there's no "simple" operation for shifting data.
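To make the byte-offset point concrete, here is a small sketch that reproduces the output from the question. It assumes the file uses Windows line endings (\r\n), which the question's path suggests but does not state; the scratch file and the variable names are made up for the illustration.

```python
import os
import tempfile

# Assumed original contents, with Windows-style line endings.
original = (b"This is an example file.\r\n"
            b"These are its contents.\r\n"
            b"This is line 3.\r\n")

# Hypothetical scratch file standing in for sample.txt.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, "wb") as f:
        f.write(original)

    new_line = b"Now I am adding text.\r\n"  # 23 bytes
    with open(path, "r+b") as f:
        f.seek(0)
        f.write(new_line)  # replaces the first 23 bytes in place

    with open(path, "rb") as f:
        result = f.read()
finally:
    os.remove(path)

print(result.decode("ascii"))
```

The first line of the original is 24 characters plus the line ending, so writing 23 bytes leaves only its final "." behind, and the file's total length does not change.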

Generally, we solve this by reading all the data into memory and then writing it back to the file. You write the "original" data up to the byte offset where the "new" content belongs, then your new buffer, then the rest of the "original" data. In Python, you don't have to interleave multiple buffers by hand when writing, since the data is already held in ordinary objects such as str or bytes. You just concatenate them (e.g. if it's a text file, concatenate the 3 strings) and write the result.
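A minimal sketch of that read, concatenate, and rewrite approach, working on bytes so the offset is unambiguous. The helper name insert_at and the scratch file are hypothetical, not part of any library.

```python
import os
import tempfile

def insert_at(path, offset, data):
    """Insert bytes `data` at byte `offset` by rewriting the whole file."""
    with open(path, "r+b") as f:
        content = f.read()
        f.seek(0)
        # before + new + after: the three pieces concatenated in memory
        f.write(content[:offset] + data + content[offset:])
        # The new content is never shorter than the old, so no truncate is needed.

# Usage on a hypothetical scratch file.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, "wb") as f:
        f.write(b"This is an example file.\nThis is line 2.\n")
    insert_at(path, 0, b"Now I am adding text.\n")
    with open(path, "rb") as f:
        result = f.read()
finally:
    os.remove(path)

print(result.decode("ascii"))
```

Passing 0 as the offset prepends; any other offset splits the original at that byte.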

If the file is too large to comfortably hold in memory, you can write to a "new" temporary file instead, and then swap it with the original once the write is finished.
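One way the temporary-file variant might look, copying in chunks so the whole file is never in memory at once. The function name insert_streaming and the chunk size are assumptions for illustration; note that the replacement file gets the temporary file's permissions, not the original's.

```python
import os
import shutil
import tempfile

def insert_streaming(path, offset, data, chunk_size=64 * 1024):
    """Insert bytes `data` at byte `offset` via a temporary file."""
    # Create the temp file next to the original so the final swap
    # stays on the same file system.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as dst, open(path, "rb") as src:
            remaining = offset
            while remaining > 0:  # copy the part before the offset
                chunk = src.read(min(chunk_size, remaining))
                if not chunk:
                    break  # offset is past the end of the file
                dst.write(chunk)
                remaining -= len(chunk)
            dst.write(data)  # the new content
            shutil.copyfileobj(src, dst, chunk_size)  # the rest
        os.replace(tmp_path, path)  # swap the new file into place
    except BaseException:
        os.remove(tmp_path)
        raise

# Usage on a hypothetical scratch file.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, "wb") as f:
        f.write(b"abcdef")
    insert_streaming(path, 3, b"XYZ", chunk_size=2)
    with open(path, "rb") as f:
        result = f.read()
finally:
    os.remove(path)
```

Both files are closed before os.replace runs, which matters on Windows, where an open file cannot be replaced.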


Now if you consider the "shifting" of data in data blocks I mentioned above, you might consider the simpler edge case where you happen to be inserting data of length N at an offset that's a multiple of N, where N is the fixed size of the data block in the file system. In this case, if you think of the data blocks as a linked list, you might consider it a rather simple operation to add a new data block between the offset you're inserting at and the next block in the list.

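In fact, Linux does support inserting space at such a block-aligned boundary on some file systems. See fallocate, specifically its FALLOC_FL_INSERT_RANGE mode, which requires the offset and length to be multiples of the file system block size.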

noahnu