8

I am trying to unzip 150 zip files. The zip files all have different names, and they are spread across one big folder that is divided into many subfolders and sub-subfolders. I want to extract each archive to a separate folder with the same name as the original zip file, in the same location as the original zip file. My code is:

import zipfile    
import os,os.path,sys  

pattern = '*.zip'  
folder = r"C:\Project\layers"   
files_process = []  
for root,dirs,files in os.walk(r"C:\Project\layers"):  
    for filenames in files:  
        if filenames == pattern:  
            files_process.append(os.path.join(root, filenames))  
            zip.extract() 

After I run the code, nothing happens. Thanks in advance for any help on this.

martineau
newGIS

3 Answers

16

UPDATE:

Finally, this code worked for me:

import zipfile,fnmatch,os

rootPath = r"C:\Project"
pattern = '*.zip'
for root, dirs, files in os.walk(rootPath):
    for filename in fnmatch.filter(files, pattern):
        print(os.path.join(root, filename))
        zipfile.ZipFile(os.path.join(root, filename)).extractall(os.path.join(root, os.path.splitext(filename)[0]))
newGIS
  • Use `os.path.splitext(filename)[0]` instead of `filename.split('.')[0]`. The latter returns a wrong result if there are multiple dots in the filename. – jfs Jun 04 '15 at 19:09
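To illustrate the loop in this answer together with the comment above, here is a minimal sketch that wraps the same `os.walk` / `fnmatch` approach in a function. The name `extract_all_zips` is hypothetical (it is not from the original post), and the `with` block and the two-pass structure are additions made here so that each archive is closed after extraction and freshly created folders are not walked in the same pass.

```python
import fnmatch
import os
import zipfile


def extract_all_zips(root_path, pattern="*.zip"):
    """Extract every matching archive into a sibling folder named after it.

    Hypothetical helper, not part of the original answer.
    Returns the list of directories that were extracted into.
    """
    # Collect the archives first so that folders created by extraction
    # are not visited during the same walk.
    archives = []
    for root, dirs, files in os.walk(root_path):
        for filename in fnmatch.filter(files, pattern):
            archives.append(os.path.join(root, filename))

    extracted = []
    for archive_path in archives:
        # splitext strips only the final extension: "a.b.zip" -> "a.b",
        # whereas "a.b.zip".split(".")[0] would give just "a".
        target_dir = os.path.splitext(archive_path)[0]
        with zipfile.ZipFile(archive_path) as archive:
            archive.extractall(target_dir)
        extracted.append(target_dir)
    return extracted
```

Calling `extract_all_zips(r"C:\Project")` would then behave like the loop above, with the extracted folder names returned instead of printed.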
7

You could use Path.rglob() to enumerate zip-files recursively and shutil.unpack_archive() to unpack zip files:

#!/usr/bin/env python3
import logging
from pathlib import Path
from shutil import unpack_archive

zip_files = Path(r"C:\Project\layers").rglob("*.zip")
while True:
    try:
        path = next(zip_files)
    except StopIteration:
        break # no more files
    except PermissionError:
        logging.exception("permission error")
    else:
        extract_dir = path.with_name(path.stem)
        unpack_archive(str(path), str(extract_dir), 'zip')

It "extract[s] each archive to separate folder with the same name as the original zip file name and also in the same place as the original zip file" e.g., it extracts 'layers/dir/file.zip' archive into 'layers/dir/file' directory.
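The path mapping described above can be checked in isolation. This is only a small sketch: `PurePosixPath` is used here (an assumption made for the example, not something the answer uses) so that the result is the same on any operating system, while the answer itself works with concrete `Path` objects.

```python
from pathlib import PurePosixPath

# PurePosixPath does no filesystem access; it only manipulates the path text.
archive = PurePosixPath("layers/dir/file.zip")

# stem drops the final suffix; with_name replaces the last path component.
extract_dir = archive.with_name(archive.stem)

print(extract_dir)  # layers/dir/file
```

Because `stem` removes only the last suffix, a name such as `a.b.zip` maps to `a.b`, which matches the behaviour of `os.path.splitext`.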

jfs
  • I get an error: Traceback (most recent call last): File "D:\desktop\python.py", line 2, in <module> from pathlib import Path ImportError: No module named pathlib – newGIS Feb 05 '15 at 13:48
  • I am using the Python 2.7.8 shell. – newGIS Feb 05 '15 at 13:52
  • @Y.Y.C: there is a backport of `pathlib` for Python 2. Run [`pip install pathlib`](https://pypi.python.org/pypi/pathlib/) or [install Python 3](https://www.python.org/downloads/) – jfs Feb 05 '15 at 14:28
  • Where do I install pathlib? – newGIS Feb 08 '15 at 09:21
  • @Y.Y.C: just run the `pip` command. If you don't know what the command line is or what the `pip` command does, ask a separate question (or, more likely, find an existing one). – jfs Feb 08 '15 at 09:29
  • J.F.Sebastian, I must work with Python 2.7.8, so I installed pathlib. But now I get an error: Traceback (most recent call last): File "C:\yaron\shonot\software\gis\tools\YARON_SCRIPTS\unzip.py", line 3, in <module> from shutil import unpack_archive ImportError: cannot import name unpack_archive – newGIS Mar 09 '15 at 07:23
  • @Y.Y.C do you see `unpack_archive()` in [the documentation for `shutil` module](https://docs.python.org/2/library/shutil.html)? You could [use zipfile module, to extract the archive](https://docs.python.org/2/library/zipfile.html). Use `unicode` instead of `str` for filenames on Windows. – jfs Mar 10 '15 at 11:50
  • +1 for [unpack_archive](https://docs.python.org/3/library/shutil.html#shutil.unpack_archive) and [rglob](https://docs.python.org/3/library/pathlib.html#pathlib.Path.rglob). Did not know of either of those. – mgk Jun 04 '15 at 12:10
  • The pathlib module for python2 is not supported anymore. The backport of pathlib for python2 is called [pathlib2](https://pypi.org/project/pathlib2/), so you need to `pip install pathlib2` if still stuck in the 2.x era. – Davos Apr 22 '18 at 13:03
  • @Davos: note: the script has python3 shebang. You should run it using Python 3 interpreter. – jfs Apr 22 '18 at 13:06
  • Agreed, I was just adding new information to your previous comment about the backport, best to use `pathlib2` not `pathlib` – Davos Apr 22 '18 at 13:09
0

To unzip all the files into a temporary folder (Ubuntu)

import tempfile
import zipfile

tmpdirname = tempfile.mkdtemp()

zf = zipfile.ZipFile('/path/to/zipfile.zip')

for fn in zf.namelist():
    temp_file = tmpdirname+"/"+fn
    #print(temp_file)

    f = open(temp_file, 'w')
    f.write(zf.read(fn).decode('utf-8'))
    f.close()
George Fisher
  • the decoding assumes the files are text – George Fisher May 22 '17 at 18:16
  • I would encourage the usage of `os.path.join(tmpdirname, fn)` instead of `tmpdirname+"/"+fn` – Ivan De Paz Centeno Sep 04 '17 at 14:20
  • You could open the temp_file as binary so you don't need to decode it, and use context manager for the file handle so you don't need the `f.close()` , e.g. `with open(temp_file, 'wb') as f:` Even better you could replace the loop over `namelist` with the `zipfile.extractall` method and supply tmpdirname as the path. This doesn't answer the question though. The question mentions nothing about temp directories, it asks "i want to extract each archive to separate folder with the same name as the original zip file name and also **in the same place** as the original zip file" – Davos Apr 22 '18 at 13:25
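The suggestions in the comments above (skip the text decoding and let `extractall` do the work) could be sketched as follows. The function name `extract_to_temp` is hypothetical, and like the answer it only extracts into a temporary directory, so it does not address the "same place as the original zip file" part of the question.

```python
import tempfile
import zipfile


def extract_to_temp(zip_path):
    """Extract an archive into a fresh temporary directory and return its path.

    Hypothetical helper; the caller is responsible for deleting the directory.
    """
    tmpdirname = tempfile.mkdtemp()
    with zipfile.ZipFile(zip_path) as zf:
        # extractall writes the members as raw bytes, so binary files work,
        # and it creates any subdirectories stored in the archive.
        zf.extractall(tmpdirname)
    return tmpdirname
```

The returned directory can be removed with `shutil.rmtree` once it is no longer needed.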