4

I tried different ways to archive a folder. My understanding was that a Python built-in module is always faster than subprocess.call("Linux command"), but in a quick test the tarfile module was slower than subprocess.call("tar"). Can someone explain why?

    #!/usr/bin/python

    import time
    import tarfile
    import subprocess

    TestFolder = ["Jack", "Robin"]

    # Method 1: the tarfile module, gzip-compressed ("w:gz")
    tStart1 = time.time()
    for folder in TestFolder:
        name = "/mnt/ShareDrive/Share/ExistingUsers/" + folder
        path = "/mnt/TEST2/"
        tar = tarfile.open(path + folder + ".tar.gz", "w:gz")
        tar.add(name)
        tar.close()
    tEnd1 = time.time()

    time.sleep(2)

    # Method 2: the external tar command, run through subprocess
    tStart2 = time.time()
    for folder in TestFolder:
        path = "/mnt/TEST1/"
        subprocess.call(["tar", "zcvf", path + folder + ".tar.gz", "-P", "/mnt/ShareDrive/Share/ExistingUsers/" + folder])
    tEnd2 = time.time()

    # Parenthesized print works on both Python 2 and Python 3
    print("The module cost %f sec" % (tEnd1 - tStart1))
    print("The subprocess cost %f sec" % (tEnd2 - tStart2))

The tarfile module took 63 sec. The subprocess took only 32 sec.

The total size of the two folders is 433 MB.

ITnewbie

2 Answers

10

tar is written in C. The tarfile module is a pure Python implementation of tar handling. There is no way that the module will be faster than the command.

Ignacio Vazquez-Abrams
  • Thanks for the explanation. It seems I had a big misunderstanding about Python modules. Thank you so much! – ITnewbie Aug 10 '17 at 21:35
2

In my experiments, tarfile with w:gz is much slower than tar cfz, but also gives better compression. So I suppose it's slower because tarfile just has a higher default compression level.

Erik Carstensen
  • Thanks for this - I think the compression level might be a big factor. According to the `tarfile.open` documentation (https://docs.python.org/3/library/tarfile.html#tarfile.open) the default level is 9 (the maximum) for `w:gz`, significantly higher (and slower) than the default for `tar czf` (level 6). I found this answer https://stackoverflow.com/a/45308288/579925 to a different SO question helped me with modifying the compression level for `tarfile.open`. – Peter Briggs Jun 02 '23 at 06:49
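
To make the compression-level point above concrete, here is a minimal sketch that writes the same folder at level 9 and at level 6 and reports the time and archive size for each. It assumes Python 3. The sample data and the helper names `make_archive`, `list_members` and `demo` are made up for illustration; the only library feature it relies on is the `compresslevel` keyword that `tarfile.open` accepts for `w:gz`. Timings and sizes will vary with the data and the machine, so treat it as a way to measure rather than as a result.

```python
import os
import tarfile
import tempfile
import time


def make_archive(source_dir, archive_path, level):
    """Write source_dir to a gzip-compressed tar archive at the given level.

    Returns (elapsed_seconds, archive_size_in_bytes).
    """
    start = time.time()
    # compresslevel is a keyword of tarfile.open for "w:gz"
    # (the tarfile docs give 9 as its default).
    with tarfile.open(archive_path, "w:gz", compresslevel=level) as tar:
        tar.add(source_dir, arcname=os.path.basename(source_dir))
    return time.time() - start, os.path.getsize(archive_path)


def list_members(archive_path):
    """Return the sorted member names stored in a .tar.gz archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(tar.getnames())


def demo(work_dir):
    """Build a small sample folder (hypothetical data) and archive it twice."""
    source = os.path.join(work_dir, "sample")
    os.mkdir(source)
    for i in range(20):
        with open(os.path.join(source, "file%d.txt" % i), "w") as f:
            f.write("some repetitive sample text %d\n" % i * 20000)

    results = {}
    for level in (9, 6):
        archive = os.path.join(work_dir, "sample-level%d.tar.gz" % level)
        elapsed, size = make_archive(source, archive, level)
        results[level] = (archive, elapsed, size)
        print("level %d: %.3f sec, %d bytes" % (level, elapsed, size))
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo(tmp)
```

To apply this to the benchmark in the question, the only change would be passing `compresslevel=6` in the existing `tarfile.open(..., "w:gz")` call, which should put the two methods on a more comparable footing.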