In Python 2.6, is there a more efficient way to search a file line by line for a string and, once it is found, insert some lines into that file? The output file would be the same as the input file with a few lines added in between. I'd rather not read the whole file into memory, because the files can be very large.

Right now, I read the file line by line and write each line to a temp file until I find the line I'm looking for, then write the extra data to the temp file, followed by the rest of the input. When I'm done processing the file, I overwrite the old file with the temp file. Something like this:

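    import os
    import re

    # file_in_read, file_in_write and write_some_data() are defined elsewhere
    with open(file_in_read, 'r') as inFile:
        # start from a fresh temp file
        if os.path.exists(file_in_write):
            os.remove(file_in_write)
        with open(file_in_write, 'a') as outFile:
            for line in inFile:
                if re.search(r'<search_string', line):
                    # insert the extra data before the matching line
                    write_some_data(outFile)
                outFile.write(line)

    # replace the original file with the temp file
    os.rename(file_in_write, file_in_read)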

I was just wondering if I can speed it up somehow.

umayneverknow
    Possible duplicate of [Inserting Line at Specified Position of a Text File](https://stackoverflow.com/questions/1325905/inserting-line-at-specified-position-of-a-text-file) –  Nov 09 '17 at 03:25

2 Answers

You can seek to a position in the file with file.seek and write there, but that overwrites the bytes at that position rather than inserting before them. It only helps when the data lives at a fixed offset in the file, which is generally not what you want.

If the data needs to go after some other data that has no fixed offset and size, then there is no way around it: you need to read the file to find out its offset and size.
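To illustrate the point about file.seek, here is a minimal sketch showing that writing at an offset replaces the existing bytes instead of shifting them. The helper name `overwrite_at` and the demo file contents are made up for this example.

```python
import os
import tempfile

def overwrite_at(path, offset, data):
    """Write `data` at byte `offset`; existing bytes there are replaced, not shifted."""
    with open(path, 'r+b') as f:
        f.seek(offset)
        f.write(data)

# Hypothetical demo file with three short lines
fd, demo_path = tempfile.mkstemp()
os.close(fd)
with open(demo_path, 'wb') as f:
    f.write(b'line1\nline2\nline3\n')

# Offset 6 is the first byte of "line2"
overwrite_at(demo_path, 6, b'XXXXX')

with open(demo_path, 'rb') as f:
    result = f.read()
os.remove(demo_path)
# result is b'line1\nXXXXX\nline3\n': "line2" was replaced and the file did not grow
```

So to really insert, everything after the insertion point has to be rewritten, which is what the temp-file approach in the question already does.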

You may be having an XY problem: you think you can solve X with Y, so you ask for help with Y instead of asking for help with X. If you share what you are trying to achieve with these files, other people may suggest better solutions.

geckos

It looks like using the fileinput module in the standard library is the way to go. You can simplify your code to:

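    import fileinput
    import re
    import sys

    regex = re.compile(r'<pattern>')

    # inplace=True redirects sys.stdout into the file being rewritten
    for line in fileinput.input(file_in_read, inplace=True):
        sys.stdout.write(line)
        if regex.search(line):
            # additional_lines: the text to insert, ending with a newline
            sys.stdout.write(additional_lines)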
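For something you can run as-is, the same idea might be wrapped up roughly like this. The function name `insert_after_matches`, the `<marker>` pattern and the demo file are hypothetical, chosen only for illustration.

```python
import fileinput
import os
import re
import sys
import tempfile

def insert_after_matches(path, pattern, extra):
    """Rewrite `path` in place, emitting `extra` after each line that matches `pattern`."""
    regex = re.compile(pattern)
    try:
        # With inplace=True, fileinput moves the original aside and
        # redirects sys.stdout into a new file at `path`.
        for line in fileinput.input(path, inplace=True):
            sys.stdout.write(line)
            if regex.search(line):
                sys.stdout.write(extra)
    finally:
        fileinput.close()  # restores sys.stdout even if an error occurs

# Hypothetical demo file
fd, demo_path = tempfile.mkstemp()
os.close(fd)
with open(demo_path, 'w') as f:
    f.write('alpha\n<marker>\nomega\n')

insert_after_matches(demo_path, r'<marker>', 'inserted 1\ninserted 2\n')

with open(demo_path) as f:
    demo_result = f.read()
os.remove(demo_path)
```

Note that fileinput still reads every line and writes a complete new copy of the file behind the scenes, so this is mostly shorter code rather than a faster algorithm.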