3

I am trying to download a zip file to a local drive and extract all files to a destination folder.

So far I have a solution that only copies a file from one directory to another; it doesn't work for downloading a file from a URL. For the extraction, I am able to get it to work in Python 2.6 but not in 2.5. I am definitely open to suggestions for a workaround or a different approach. Thanks in advance.

# This part works, but it only copies a local file; it is no good for URL links.
import shutil

sourceFile = r"C:\Users\blueman\master\test2.5.zip"
destDir = r"C:\Users\blueman\user"
shutil.copy(sourceFile, destDir)
print "file copied"

# Extraction works in Python 2.6, but not in version 2.5.
import zipfile

GLBzipFilePath = r'C:\Users\blueman\user\test2.5.zip'
GLBextractDir = r'C:\Users\blueman\user'

def extract(zipFilePath, extractDir):
    # The class is zipfile.ZipFile; the zipfile module itself is not callable.
    zf = zipfile.ZipFile(zipFilePath)
    zf.extractall(path=extractDir)
    zf.close()
    print "it works"

extract(GLBzipFilePath, GLBextractDir)
Isaiah
marcus

3 Answers

14

urllib.urlretrieve can get a file (zip or otherwise;-) from a URL to a given path.
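
As a minimal sketch of that call: the helper name `download_to` and its parameters are made up for illustration, and the `try/except ImportError` is only there so the same snippet also runs on Python 3, where the function lives in `urllib.request`.

```python
import os

try:
    from urllib import urlretrieve          # Python 2
except ImportError:
    from urllib.request import urlretrieve  # Python 3


def download_to(url, dest_dir, filename):
    """Download url into dest_dir/filename and return the local path."""
    local_path = os.path.join(dest_dir, filename)
    # urlretrieve returns a (local filename, headers) tuple.
    saved_path, headers = urlretrieve(url, local_path)
    return saved_path
```

Called as `download_to('http://www.example.com/file.zip', r'C:\Users\blueman\user', 'file.zip')`, it should leave the archive in the user folder; the URL here is a placeholder.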

extractall is indeed new in 2.6, but in 2.5 you can use an explicit loop (get all names, open each name, etc). Do you need example code?

So here's the general idea. It needs more try/except if you want a nice error message for every case that could go wrong (and there are a million variants); I'm only handling a couple of such cases as examples:

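import os
import urllib
import zipfile

def getunzipped(theurl, thedir):
  # Download the archive to a temporary file inside the target directory.
  name = os.path.join(thedir, 'temp.zip')
  try:
    name, hdrs = urllib.urlretrieve(theurl, name)
  except IOError, e:
    print "Can't retrieve %r to %r: %s" % (theurl, thedir, e)
    return
  try:
    z = zipfile.ZipFile(name)
  except zipfile.error, e:
    print "Bad zipfile (from %r): %s" % (theurl, e)
    return
  # Explicit loop instead of extractall, which is new in 2.6.
  for n in z.namelist():
    dest = os.path.join(thedir, n)
    destdir = os.path.dirname(dest)
    if not os.path.isdir(destdir):
      os.makedirs(destdir)
    if n.endswith('/'):
      continue  # directory entry: the folder exists now, nothing to write
    data = z.read(n)
    f = open(dest, 'wb')  # binary mode, so the bytes are not altered on Windows
    f.write(data)
    f.close()
  z.close()
  os.unlink(name)  # remove the downloaded archive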
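
To see how an explicit loop can keep the folder structure of the archive, here is a standalone sketch of just the extraction step. The name `extract_with_loop` is hypothetical, the code avoids `extractall` and the `with` statement so it should also suit Python 2.5, and it does not validate member names, so it is only meant for archives you trust.

```python
import os
import zipfile


def extract_with_loop(zip_path, dest_dir):
    """Extract every member of zip_path under dest_dir without extractall."""
    z = zipfile.ZipFile(zip_path)
    try:
        for name in z.namelist():
            dest = os.path.join(dest_dir, name)
            parent = os.path.dirname(dest)
            if not os.path.isdir(parent):
                os.makedirs(parent)
            if name.endswith('/'):
                # Directory entry: the folder now exists, nothing to write.
                continue
            out = open(dest, 'wb')  # binary mode keeps the bytes intact
            try:
                out.write(z.read(name))
            finally:
                out.close()
    finally:
        z.close()
```

Zip archives always use `/` as the separator in member names, which is why the directory check looks for a trailing `/` rather than `os.sep`.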
Alex Martelli
  • Yes, I do; I am a super newbie in Python. Thanks for the directions. – marcus Nov 21 '09 at 20:27
  • I tinkered with the script for some time and had to come back. Although "for n in z.namelist():" iterates over all files, I can't seem to unzip folders within the zip file and maintain the file structure of the zip file. Thanks once more. – marcus Dec 04 '09 at 03:41
  • @marcus, the code I gave works great for me: why not post exactly what error you get rather than the totally generic "can't seem to be able"?! Obviously nobody can help you without information. – Alex Martelli Dec 04 '09 at 05:23
2

For downloading, look at urllib:

import urllib
webFile = urllib.urlopen(url)

For unzipping, use zipfile. See also this example.
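
Note that `urlopen` only returns a file-like object, so the body still has to be written to disk before `zipfile` can open it. A minimal sketch of that step follows; `save_url` and `chunk_size` are illustrative names, and the import fallback is only there so the snippet also runs on Python 3.

```python
try:
    from urllib import urlopen          # Python 2
except ImportError:
    from urllib.request import urlopen  # Python 3


def save_url(url, local_path, chunk_size=64 * 1024):
    """Copy the response body of url to local_path in fixed-size chunks."""
    web_file = urlopen(url)
    try:
        out = open(local_path, 'wb')  # binary mode: a zip is not text
        try:
            while True:
                chunk = web_file.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)
        finally:
            out.close()
    finally:
        web_file.close()
    return local_path
```

Reading in chunks keeps memory use flat for large archives; for small files a single `web_file.read()` would do the same job.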

Mark Byers
  • The example I linked to probably works in Python 2.5 as it does not use the new function ZipFile.extractall. – Mark Byers Nov 21 '09 at 22:54
2

The shortest way I've found so far is to use Alex's answer, but with ZipFile.extractall() instead of the loop (so it needs Python 2.6 or later):

from zipfile import ZipFile
from urllib import urlretrieve
from tempfile import mktemp

filename = mktemp('.zip')
destDir = mktemp()
theurl = 'http://www.example.com/file.zip'
name, hdrs = urlretrieve(theurl, filename)
thefile=ZipFile(filename)
thefile.extractall(destDir)
thefile.close()
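
One caveat: `tempfile.mktemp` only returns a name and has been deprecated since Python 2.3, because another process can create that path before you do. A possible variant using `mkstemp` and `mkdtemp`, which create the file and directory themselves, is sketched below. The function name `fetch_and_extract` is made up, and the import fallback is only there so the snippet also runs on Python 3.

```python
import os
import shutil
import tempfile
import zipfile

try:
    from urllib import urlretrieve          # Python 2
except ImportError:
    from urllib.request import urlretrieve  # Python 3


def fetch_and_extract(url):
    """Download a zip from url, extract it, and return the new directory."""
    dest_dir = tempfile.mkdtemp()            # the directory is really created
    fd, zip_path = tempfile.mkstemp('.zip')  # the file is really created
    os.close(fd)                             # urlretrieve reopens it by name
    try:
        urlretrieve(url, zip_path)
        archive = zipfile.ZipFile(zip_path)
        try:
            archive.extractall(dest_dir)     # needs Python 2.6 or later
        finally:
            archive.close()
    except Exception:
        shutil.rmtree(dest_dir)              # do not leave a half-filled directory
        raise
    finally:
        os.unlink(zip_path)                  # the archive itself is no longer needed
    return dest_dir
```

The caller owns the returned directory and is responsible for removing it, for example with `shutil.rmtree`.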
Ohad Cohen