
I am trying to write a byte array at the beginning of a file, and at a (much) later point I want to split them again and retrieve the original file. The byte_array is just a small JPEG.

# write a byte array at the beginning of a file
def write_byte_array_to_beginning_of_file( byte_array, file_path, out_file_path ):
    with open( file_path, "rb" ) as f:
        with open( out_file_path, "wb" ) as f2:
            f2.write( byte_array )
            # read() with no size argument loads the entire input file into memory
            f2.write( f.read( ) )

While the function works, it hogs a lot of memory. It seems to read the whole file into memory before doing anything. Some of the files I need to work on are in excess of 40 GB, and this all runs on a small NAS with 8 GB of RAM.

What would be a memory-conscious way to achieve this?
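
For the later split, the same idea works in reverse, assuming the length of the prepended JPEG is known at that point (how that length is stored is up to the application; the helper below is a hypothetical sketch, not the asker's code):

def strip_prefix_from_file( prefix_length, in_file_path, out_file_path, chunksize = 10 * 1024 * 1024 ):
    with open( in_file_path, "rb" ) as f, open( out_file_path, "wb" ) as f2:
        f.seek( prefix_length )  # skip the prepended bytes
        while True:
            block = f.read( chunksize )  # copy the payload in bounded chunks
            if not block:
                break
            f2.write( block )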

globus243
  • Prepending to a 40 GB file is generally going to be very slow. Ideally append, but if that's not an option then you could leave enough space at the start of the file so you can overwrite the blank section of bytes without actually changing its length (see the sketch after these comments). As another alternative (and this is filesystem specific) you can potentially add file sectors to the start of a file instead and write to just those, leaving the rest of the file untouched. – Luke Briggs Jul 31 '22 at 00:50
  • Seems like a bad idea. Why do you want to do this? – Kelly Bundy Jul 31 '22 at 01:04
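
A minimal sketch of the "reserve space up front" idea from the first comment, assuming you control how the combined file is created in the first place; HEADER_SIZE and both helper names are hypothetical:

HEADER_SIZE = 1 * 1024 * 1024  # hypothetical reserved region; must exceed any JPEG you prepend

def create_with_placeholder( data_path, out_path, chunksize = 10 * 1024 * 1024 ):
    # When first writing the file, put a zeroed placeholder ahead of the payload.
    with open( data_path, "rb" ) as src, open( out_path, "wb" ) as dst:
        dst.write( b"\x00" * HEADER_SIZE )
        while True:
            block = src.read( chunksize )
            if not block:
                break
            dst.write( block )

def fill_header( out_path, byte_array ):
    # Later, overwrite the reserved region in place; the large payload is never rewritten.
    if len( byte_array ) > HEADER_SIZE:
        raise ValueError( "byte_array larger than the reserved header" )
    with open( out_path, "r+b" ) as f:
        f.seek( 0 )
        f.write( byte_array )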

1 Answer


You can read from the original file in chunks instead of reading the whole thing.

def write_byte_array_to_beginning_of_file( byte_array, file_path, out_file_path, chunksize = 10 * 1024 * 1024 ):
    with open( file_path, "rb" ) as f, open( out_file_path, "wb" ) as f2:
        f2.write( byte_array )  # write the prefix first
        while True:
            block = f.read(chunksize)  # read at most chunksize bytes
            if not block:  # an empty result means end of file
                break
            f2.write(block)

This reads the input in 10 MB chunks by default, which you can override via the chunksize parameter.
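
For what it's worth, the standard library's shutil.copyfileobj implements the same chunked copy loop, so an equivalent version (a sketch under the same assumptions, not part of the original answer) could be:

import shutil

def write_byte_array_to_beginning_of_file( byte_array, file_path, out_file_path, chunksize = 10 * 1024 * 1024 ):
    with open( file_path, "rb" ) as f, open( out_file_path, "wb" ) as f2:
        f2.write( byte_array )  # write the prefix first
        shutil.copyfileobj( f, f2, chunksize )  # stream the rest in chunksize-byte blocks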

Barmar