
Copying a file in Python 3 using the following code takes a lot of time:

shutil.copy(self.file, self.working_dir)

However, the Linux cp command is pretty fast. If I execute that shell command from Python 3 to copy files larger than 100 GB, will that be a reliable option for production servers?
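
For reference, this is roughly what I have in mind for calling cp from Python 3 (the paths here are just placeholders):

import subprocess

src = "/data/big_file.bin"   # placeholder source path
dst = "/mnt/storage/"        # placeholder destination directory

# Invoke the system cp; check=True raises CalledProcessError if cp fails
subprocess.run(["cp", src, dst], check=True)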

I have seen this answer, but its suggestions are not fast either.

Aviral Srivastava
  • One possible issue you might encounter using `shutil.copy()` is that on POSIX platforms, file owner and group, as well as ACLs, [will be lost](https://docs.python.org/3/library/shutil.html) during transfer. `shutil.copyfile()` should suit your needs just fine. Have you run any benchmarks and seen actual performance degradation? – hyperTrashPanda Feb 04 '19 at 12:07
  • 1
    Another thing to consider is that for a 100GB file, using the main thread may block a lot and that might not be what you want, so a new thread or a subprocess would be pretty much required to background the large copy task. – Sam Rockett Feb 04 '19 at 12:14
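
A minimal sketch of what backgrounding the copy in a worker thread (as the comment above suggests) might look like, with placeholder paths:

import shutil
import threading

def copy_in_background(src, dst):
    # Run shutil.copy in a separate thread so the main thread is not blocked
    worker = threading.Thread(target=shutil.copy, args=(src, dst))
    worker.start()
    return worker  # call worker.join() later to wait for the copy to finish

copy_thread = copy_in_background("/data/big_file.bin", "/mnt/storage/")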

1 Answer


If you are running on Windows, Python's copy buffer size may be too small: https://stackoverflow.com/a/28584857/679240

You would need to implement something similar to this (warning: untested):

def copyfile_largebuffer(src, dst, length=16*1024*1024):
    # Open the source and destination in binary mode and stream between them
    with open(dst, 'wb') as outfile, open(src, 'rb') as infile:
        copyfileobj_largebuffer(infile, outfile, length=length)

def copyfileobj_largebuffer(fsrc, fdst, length=16*1024*1024):
    # Copy in chunks of `length` bytes (16 MiB by default) instead of
    # shutil's much smaller default buffer
    while True:
        buf = fsrc.read(length)
        if not buf:
            break
        fdst.write(buf)
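
If that helps, usage would be something along these lines (the paths are placeholders, and the buffer size is just a starting point to experiment with):

copyfile_largebuffer('/data/big_file.bin', '/data/big_file_copy.bin')

# Or try an even larger buffer, e.g. 128 MiB
copyfile_largebuffer('/data/big_file.bin', '/data/big_file_copy.bin', length=128*1024*1024)
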
Haroldo_OK