12

I would like to insert a string at a specific column of a specific line in a file, without reading and rewriting the entire file.

Suppose I have a file file.txt

How was the English test?
How was the Math test?
How was the Chemistry test?
How was the test?

I would like to change the last line to say How was the History test? by adding the string History at line 4 column 13.

Currently I read in every line of the file and add the string to the specified position.

with open("file.txt", "r+") as f:
    # Read entire file
    lines = f.readlines()

    # Update line
    lino = 4 - 1
    colno = 13 -1
    lines[lino] = lines[lino][:colno] + "History " + lines[lino][colno:]

    # Rewrite file
    f.seek(0)
    for line in lines:
        f.write(line)
    f.truncate()
    f.close()

But I feel like I should be able to simply insert the string into the file without having to read and rewrite the entire file.

Increasingly Idiotic
  • 5
    Do not always trust your feelings. Unless the old word and the replacement have exactly the same length, the only way to modify the file content is to read the file, modify the content, and write it back. – DYZ Apr 09 '18 at 23:15
  • @DyZ not exactly, there is a better way. Sanity test: add a string at the end of file – Marat Apr 09 '18 at 23:16
  • 1
    @Marat But please read the first line of the question. – DYZ Apr 09 '18 at 23:17
  • At the very least there should be a way to only modify the file from the change point onward. What if I want to change the last line of a hundred-thousand-line file? Reading and rewriting the entire file cannot be the best solution. – Increasingly Idiotic Apr 09 '18 at 23:17
  • 4
    You still have to read the whole file - but you should write only the modified line and everything after it. – DYZ Apr 09 '18 at 23:18
  • 4
    You can scan to the desired position and write the desired text, but then you have to rewrite the remainder of the file. – Prune Apr 09 '18 at 23:19
  • @DyZ I misinterpreted the first comment. It sounded almost like read everything, modify, write everything. The next next one is what I meant - read up to the point, modify, rewrite the remainder – Marat Apr 09 '18 at 23:26
  • More or less a duplicate of https://stackoverflow.com/q/39086/2564301 and every other question that assumes you can somehow "insert" or "delete" text inside an existing file. – Jongware Apr 09 '18 at 23:41
  • I think instead of using text files to store this data, you should try using csv's along with the csv module. Column and rows are easier to process with csv's. – c_ure_sh Apr 09 '18 at 23:41
  • @c_ure_sh Unfortunately the end result needs to be a text file – Increasingly Idiotic Apr 10 '18 at 16:01
  • 1
    A csv is a text file, it is just structured. If you name the file data.txt instead of data.csv, Python does not care. There is another question similar to this on SO, and replacing text mid-file is far more complex than it seems. https://stackoverflow.com/questions/17140886/how-to-search-and-replace-text-in-a-file-using-python The standard approach is to read the file into a data structure, update the data, and write it back to the file, as recommended by @Jack Aidley – diek Apr 13 '18 at 00:33
  • @DyZ I implemented an answer that does what you suggested. – Chris Hagmann Apr 27 '18 at 19:08

5 Answers

3

This is possibly a duplicate of the following SO thread:

Fastest Way to Delete a Line from Large File in Python

That thread is about deleting a line, whereas your question is about modifying one, but the same approach applies. The code would be adapted as follows:

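def update(filename, lineno, column, text):
    # Binary mode, so tell()/seek() offsets are plain byte positions
    with open(filename, "r+b") as f:
        # Skip the lines before the one we want to update
        for _ in range(lineno - 1):
            f.readline()

        seekpoint = f.tell()

        # read the line we want to update
        line = f.readline()
        # column counts bytes here, which equals characters for ASCII text
        chars = line[:column - 1] + text.encode() + line[column - 1:]

        # Read the tail before writing anything: the inserted text puts the
        # write position ahead of the read position, so writing while still
        # reading could overwrite bytes that have not been read yet
        rest = f.read()

        # Rewrite only the updated line and everything after it
        f.seek(seekpoint)
        f.write(chars + rest)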


if __name__ == "__main__":
    update("file.txt", 4, 13, "History ")

In a large file it makes sense not to touch anything before the line where the update happens. Imagine a file with 10K lines where the update is needed at line 9K: your code loads all 9K preceding lines into memory unnecessarily. The code you have still works, but it is not the optimal way of doing it.

Tarun Lalwani
2

The function readlines() reads the entire file. But it doesn't have to. It actually reads from the current file cursor position to the end, and that position happens to be 0 right after opening. (To confirm this, try f.tell() right after the with statement.) What if we started closer to the end of the file?
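A quick way to check this is to seek into a file and then call readlines(). The sketch below assumes a throwaway temporary file holding part of the sample text; the file and its contents are only for illustration.

```python
import os
import tempfile

# Hypothetical sample file, created only for this demonstration
data = b"How was the English test?\nHow was the Math test?\nHow was the test?"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
    path = tmp.name

with open(path, "rb") as f:
    start = f.tell()            # cursor position right after opening
    f.seek(len(data) - 17)      # jump to the start of the 17-byte last line
    tail = f.readlines()        # reads from the cursor to the end, not from 0

os.remove(path)
print(start, tail)
```

Only the last line comes back, because readlines() started at the cursor rather than at the beginning of the file.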

The way your code is written implies some prior knowledge of your file contents and layouts. Can you place any constraints on each line? For example, given your sample data, we might say that lines are guaranteed to be 27 bytes or less. Let's round that to 32 for "power of 2-ness" and try seeking backwards from the end of the file.

# note the "rb+"; need to open in binary mode, else seeking is strictly
# a "forward from 0" operation.  We need to be able to seek backwards
with open("file.txt", "rb+") as f:
    # caveat: if file is less than 32 bytes, this will throw
    # an exception.  The second parameter, 2, says "from end of file"
    f.seek(-32, 2)

    last = f.readlines()[-1].decode()

At this point the code has only read the last 32 bytes of the file. [1] readlines() (at the byte level) looks for the line-end byte (in Unix, \n or 0x0a or byte value 10) and splits the data there, returning the part before and the part after it. Spelled out:

>>> last = f.readlines()
>>> print( last )
[b'hemistry test?\n', b'How was the test?']

>>> last = last[-1]
>>> print( last )
b'How was the test?'

Crucially, this works robustly under UTF-8 encoding by exploiting the UTF-8 property that byte values under 128 never occur in the encoding of non-ASCII characters. In other words, the exact byte \n (or 0x0a) only ever occurs as a newline and never as part of another character. If you are using a non-UTF-8 encoding, you will need to check whether the code's assumptions still hold.
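This property can be checked directly. A small sketch, with a handful of arbitrarily chosen non-ASCII characters, plus one UTF-16 counterexample to show why other encodings need a second look:

```python
# Non-ASCII characters chosen arbitrarily for illustration
samples = ["é", "ß", "漢", "😀"]

for ch in samples:
    encoded = ch.encode("utf-8")
    # Every byte of a multi-byte UTF-8 sequence has its high bit set (>= 0x80)
    assert all(byte >= 0x80 for byte in encoded)
    assert b"\n" not in encoded

# By contrast, UTF-16 can contain a 0x0a byte inside a character:
# U+010A encodes as the two bytes 0x0a 0x01 in little-endian UTF-16
contains_newline_byte = b"\n" in "\u010a".encode("utf-16-le")
print(contains_newline_byte)
```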

Another note: 32 bytes is arbitrary given the example data. A more realistic and typical value might be 512, 1024, or 4096. Finally, to put it back to a working example for you:

with open("file.txt", "rb+") as f:
    # caveat: if file is less than 32 bytes, this will throw
    # an exception.  The second parameter, 2, says "from end of file"
    f.seek(-32, 2)

    # does *not* read the whole file, unless the file is exactly 32 bytes.
    last = f.readlines()[-1]
    last_decoded = last.decode()

    # Update line
    colno = 13 -1
    last_decoded = last_decoded[:colno] + "History " + last_decoded[colno:]

    last_line_bytes = len( last )
    f.seek(-last_line_bytes, 2)
    f.write( last_decoded.encode() )
    f.truncate()

Note that there is no need for f.close(). The with statement handles that automatically.
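The short-file caveat can be sidestepped by clamping the seek to the file size. A minimal sketch: insert_in_last_line and its block parameter are hypothetical names, and it assumes a non-empty UTF-8 file whose last line is no longer than block bytes.

```python
import os
import tempfile

def insert_in_last_line(path, column, text, block=4096):
    """Insert text at a 1-based column of the last line, reading at most block bytes."""
    with open(path, "rb+") as f:
        size = f.seek(0, os.SEEK_END)      # seek() returns the new absolute position
        f.seek(max(0, size - block))       # never seek before the start of the file
        last = f.readlines()[-1]           # raw bytes of the last line
        decoded = last.decode("utf-8")
        updated = decoded[:column - 1] + text + decoded[column - 1:]
        f.seek(size - len(last))           # back to the start of the last line
        f.write(updated.encode("utf-8"))   # the file only grows, so no truncate needed

# Usage on a throwaway file that is shorter than the block size
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"How was the Math test?\nHow was the test?")
insert_in_last_line(tmp.name, 13, "History ")
with open(tmp.name, "rb") as f:
    result = f.read()
os.remove(tmp.name)
print(result)
```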

[1] The pedantic will correctly note that the computer and OS will likely have read at least 512 bytes, if not 4096 bytes, relating to the on-disk or in-memory page size.

hunteke
1

You can use this piece of code:

with open("test.txt", "r+") as f:
    # Read the whole file
    lines = f.readlines()

    # Get the column (1-based)
    column = int(input("Column:")) - 1

    # Get the line (1-based)
    line = int(input("Line:")) - 1

    # Get the word to insert
    word = input("Word:")

    lines[line] = lines[line][:column] + word + lines[line][column:]

    # Go back to the start of the file (this does not delete anything)
    f.seek(0)

    for i in lines:
        # Overwrite the file with the updated lines
        f.write(i)
Paul Rooney
1

This answer loops through the file only once and only rewrites everything after the insertion point. When the insert is near the end there is almost no overhead, and when it is at the beginning it is no worse than a full read and write.

def insert(file, line, column, text):
    ln, cn = line - 1, column - 1          # offset from human index to Python index
    count = 0                              # bytes seen before the target line
    with open(file, 'rb+') as f:           # binary mode, so offsets are exact byte positions
        for idx, row in enumerate(f):      # for each line in the file
            if idx < ln:                   # before the given line
                count += len(row)          # read and count bytes
            elif idx == ln:                # once at the line
                f.seek(count + cn)         # place cursor at the insertion point
                remainder = f.read()       # store everything after it
                f.seek(count + cn)         # move cursor back to the insertion point
                f.write(text.encode() + remainder)  # insert text and rewrite the remainder
                return                     # You're finished!
Chris Hagmann
0

I'm not sure whether you were having problems changing your file to contain the word "History", or whether you wanted to know how to only rewrite certain parts of a file, without having to rewrite the whole thing.

If you were having problems in general, here is some simple code which should work, so long as you know the line within the file that you want to change. Just change the first and last lines of the program to read and write statements accordingly.

fileData="""How was the English test?
How was the Math test?
How was the Chemistry test?
How was the test?""" # So that I don't have to create the file, I'm writing the text directly into a variable.
fileData = fileData.split("\n")
# The 3 refers to the line to add "History" to. (The first line is line 0.)
fileData[3] = fileData[3][:11] + " History" + fileData[3][11:]
storeData = "\n".join(fileData)  # Rejoin the lines without a trailing newline
print(storeData)  # You can change this to a write command.

If you wanted to know how to change specific "parts" of a file without rewriting the whole thing, then (to my knowledge) that is not possible.

Say you had a file containing "Ths is a TEST file." and you wanted to correct it to "This is a TEST file."; you would technically be changing 17 characters and adding one at the end. The "s" becomes an "i", the first space becomes an "s", the "i" (from "is") becomes a space, and so on, as the text shifts forward.

A computer can't actually insert bytes between other bytes. It can only move the data, to make room.
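The shifting can be made visible by comparing the two strings position by position. A small sketch; the variable names are only illustrative:

```python
old = "Ths is a TEST file."
new = "This is a TEST file."

# Positions that exist in both strings but hold a different character
changed = [i for i, (a, b) in enumerate(zip(old, new)) if a != b]
# Characters that had to be appended because the text grew
added = len(new) - len(old)

print(len(changed), added)   # 17 1
```

Every position from the insertion point to the end of the old text differs, which is why the remainder of the file has to be rewritten.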

Programmer S