
I am trying to copy a large file (> 1 GB) from the hard disk to a USB drive using shutil.copy. A simple script showing what I am trying to do:

import shutil
# Raw strings, otherwise sequences like "\t" in the paths are read as escapes
src_file = r"source\to\large\file"
dest = r"destination\directory"
shutil.copy(src_file, dest)

It takes only 2-3 minutes on Linux, but copying the same file takes more than 10-15 minutes under Windows. Can somebody explain why, and suggest a solution, preferably in Python code?

Update 1

Saved the script as test.py. The source file size is 1 GB and the destination directory is on the USB drive. I measured the copy time with ptime. The result is here:

ptime.exe test.py

ptime 1.0 for Win32, Freeware - http://www.pc-tools.net/
Copyright(C) 2002, Jem Berkes <jberkes@pc-tools.net>

===  test.py ===

Execution time: 542.479 s

542.479 s ≈ 9 min. I don't think shutil.copy should take 9 minutes to copy a 1 GB file.

Update 2

The health of the USB drive is good, since the same script works well under Linux. I also timed the copy of the same file with Windows' native xcopy. Here is the result:

ptime 1.0 for Win32, Freeware - http://www.pc-tools.net/
Copyright(C) 2002, Jem Berkes <jberkes@pc-tools.net>

===  xcopy F:\test.iso L:\usb\test.iso
1 File(s) copied

Execution time: 128.144 s

128.144 s ≈ 2.13 min. I still have 1.7 GB of free space after copying the test file.

sundar_ima
  • With that much information, I'm afraid that the answer to your question is "no"! – hivert Feb 15 '14 at 15:02
  • The question is very simple, and the script is also very simple: just copying a file from source to destination. I added a few lines to make it look like a Python script. I don't understand why it was downvoted :-( – sundar_ima Feb 15 '14 at 15:22
  • I suspect that part of the issue here is that Windows has recognised the USB disk as a portable device and has optimised it for fast removal (at the expense of caching). See for example https://www.thewindowsclub.com/enable-disable-disk-write-caching-windows-7-8 – Cameron Kerr Jul 20 '18 at 03:29
  • Python 3.8 includes a major overhaul / speed up of shutil. See answer below for links – fantabolous Oct 29 '19 at 09:00

2 Answers

Just to add some interesting information: Windows does not like the tiny buffer used in the internals of the shutil implementation.

I quickly tried the following:

  • Copied the original shutil.py file to the example script's folder and renamed it to myshutil.py
  • Changed the first line of the script to import myshutil
  • Edited the myshutil.py file and changed copyfileobj from

def copyfileobj(fsrc, fdst, length=16*1024):

to

def copyfileobj(fsrc, fdst, length=16*1024*1024):

Using a 16 MB buffer instead of 16 KB caused a huge performance improvement.

Maybe Python needs some tuning targeting Windows internal filesystem characteristics?

Edit:

Came to a better solution here. At the start of your file, add the following:

import shutil

def _copyfileobj_patched(fsrc, fdst, length=16*1024*1024):
    """Patches shutil method to hugely improve copy speed"""
    while True:
        buf = fsrc.read(length)
        if not buf:
            break
        fdst.write(buf)

shutil.copyfileobj = _copyfileobj_patched

This is a simple patch to the current implementation and worked flawlessly here.

Python 3.8+: Python 3.8 includes a major overhaul of shutil, including an increase of the Windows buffer from 16 KB to 1 MB (still less than the 16 MB suggested here). See ticket, diff
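On 3.8+ that new default is also exposed as a module-level constant, `shutil.COPY_BUFSIZE`, which `copyfileobj` reads at call time when no explicit length is passed. It is internal and undocumented, so treat this as a sketch that may break in future versions:

```python
import shutil

# COPY_BUFSIZE is an internal constant added in Python 3.8;
# overriding it changes the buffer used by subsequent shutil copies
# in this process (no effect on versions before 3.8).
if hasattr(shutil, "COPY_BUFSIZE"):
    shutil.COPY_BUFSIZE = 16 * 1024 * 1024
```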

fantabolous
mgruber4
  • Found this on the web: http://blogs.blumetech.com/blumetechs-tech-blog/2011/05/faster-python-file-copy.html – mgruber4 Feb 18 '15 at 13:51
  • Looking at https://github.com/python/cpython/blob/master/Lib/shutil.py they seem to have changed `copyfileobj` to use a `length` value depending on the OS. It now uses a 1 MB buffer (instead of 16 K) when running on *Windows*. – Chau Jan 10 '19 at 10:17
  • Thank you for this. This is very interesting! – DonkeyKong May 24 '19 at 01:10
  • Will this also work for the function shutil.copyfile instead of shutil.copy? I am sorry, I am a capable but novice python programmer and I don't understand on a deep level how your answer interacts with the shutil copy function. I want to use copyfile so I can rename the file when I put it in the new location – Prince M Jun 13 '19 at 19:53
  • As Python is dynamic, the idea behind the solution is to replace the original function with a patched version that defines a new default buffer size. Open the library sources (Lib/shutil.py): you will see that _copyfileobj_patched is almost a clone of the original, except for the default parameter. Edit: you should also note that shutil.copyfile consumes shutil.copyfileobj... – mgruber4 Jun 15 '19 at 06:03
  • @Chau it looks like the change to 1MB buffer is new in python 3.8: https://bugs.python.org/issue33671 – fantabolous Oct 29 '19 at 08:45
  • Can we have a custom buffer length in shutil? – Farhan Hai Khan May 30 '21 at 16:19
Your problem has nothing to do with Python: the Windows copy process is simply poor compared to the Linux one.

You can improve this by using xcopy or robocopy, according to this post: Is (Ubuntu) Linux file copying algorithm better than Windows 7?. But in that case, you have to make different calls for Linux and Windows:

import os
import shutil
import sys

# Raw strings, otherwise sequences like "\t" in the paths are read as escapes
source = r"source\to\large\file"
target = r"destination\directory"

if sys.platform == 'win32':
    os.system('xcopy "%s" "%s"' % (source, target))
else:
    shutil.copy(source, target)


Maxime Lorant
  • I had to settle for calling the system command with subprocess.call(["xcopy", source, target], shell=True). But there is an issue with `shutil.copy()` for sure. – sundar_ima Feb 15 '14 at 17:09
  • This answer will work; however, the other answer provides an explanation for the issue, does not require system-specific code, and is pythonic. – DonkeyKong May 24 '19 at 12:44