20

How can I list only the folders in a zip archive? This lists every folder and file in the archive:

import zipfile
file = zipfile.ZipFile("samples/sample.zip", "r")
for name in file.namelist():
    print(name)

Thanks.

Neil Aitken
Pythonpadavan

4 Answers

12

I don't think the previous answers are cross-platform compatible since they're assuming the pathsep is / as noted in some of the comments. Also they ignore subdirectories (which may or may not matter to Pythonpadavan ... wasn't totally clear from question). What about:

import os
import zipfile

z = zipfile.ZipFile('some.zip', 'r')
dirs = list(set([os.path.dirname(x) for x in z.namelist()]))

If you really just want top-level directories, then combine this with agroszer's answer for a final step:

topdirs = [os.path.split(x)[0] for x in dirs]

(Of course, the last two steps could be combined :)
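
As a rough sketch of the idea above (not part of the original answer), the helper below collects every directory implied by the member names, including parent directories, and skips the empty string that `dirname` returns for files in the archive root. The function name `all_dirs` and the sample names are made up for illustration, and `posixpath` is used on the assumption that member names returned by `namelist()` use forward slashes.

```python
import posixpath


def all_dirs(names):
    """Return every directory implied by a list of zip member names."""
    dirs = set()
    for name in names:
        # An entry ending in '/' is itself a directory; otherwise take its parent.
        parent = name.rstrip('/') if name.endswith('/') else posixpath.dirname(name)
        # Walk up through the ancestors; '' means the archive root and is skipped.
        while parent:
            dirs.add(parent)
            parent = posixpath.dirname(parent)
    return sorted(dirs)


# Hypothetical member names, as namelist() might return them.
names = ['folder/', 'folder/f1', 'folder2/sub/f2', 'readme.txt']
print(all_dirs(names))  # ['folder', 'folder2', 'folder2/sub']
```

With a real archive, the same function could be called as `all_dirs(z.namelist())`.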

Benjamin Loison
Dave B.
  • Great solution, but consider the edge case of files that are in the zip's "root" and not in directories. Their `os.path.dirname` would yield `''`, which you may not want listed as a directory. – EliadL Jul 14 '20 at 15:42
  • There is a typo in the code. It should be `z = zipfile.ZipFile('some.zip', 'r')` – bmabir17 May 12 '22 at 16:09
  • @bmabir17 I corrected the typo, thanks for reporting it. – Benjamin Loison Feb 17 '23 at 01:04
12

One way might be to do:

>>> [x for x in file.namelist() if x.endswith('/')]
<<< ['folder/', 'folder2/']
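
If only the top-level folder names are wanted, one possible sketch (an illustration, not the answerer's code) is to cut each name at its first `/` and keep the unique heads in first-seen order. The helper name `top_level_dirs` is hypothetical.

```python
def top_level_dirs(names):
    """Return the unique top-level folder names, in first-seen order."""
    parts = (name.partition('/') for name in names)
    # A name without '/' is a file in the archive root, so it is skipped.
    return list(dict.fromkeys(head for head, sep, _ in parts if sep and head))


names = ['folder/', 'folder/f1', 'folder/f2', 'folder2/', 'folder/f3', 'folder/f4']
print(top_level_dirs(names))  # ['folder', 'folder2']
```

With a real archive this could be called as `top_level_dirs(file.namelist())`.
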
Zach Kelling
  • I can get the desired folder list from the full list in other ways too, but I have a multi-GB zip with a lot of folders (around 10000). I just want a quicker search. – Pythonpadavan Jun 28 '11 at 17:35
  • @Pythonpadavan: There is a solution, but it is not a Pythonic way of doing it, and it will only work on Linux. `>>> os.system("unzip -l zip.zip|grep /$") 0 2011-06-28 22:59 zip/one/ 0` Replace `zip.zip` with your filename. – Kracekumar Jun 28 '11 at 18:18
  • Thanks, but the operating system is fixed, and guess what: it is Windows. – Pythonpadavan Jun 28 '11 at 18:21
  • How can I get just the folder names in the root dir without looking any further? Given ```['folder/', 'folder/f1', 'folder/f2', 'folder2/', 'folder/f3', 'folder/f4']```, I want to get ```['folder', 'folder2']```. – Pedro P. Camellon Oct 07 '21 at 07:44
5

In Python 3 (ZipInfo.is_dir() requires 3.6 or later), this assumes absolute paths are fed to ZipFile:

from zipfile import ZipFile

zip_f = ZipFile("./Filename.zip")

# All directories:
for f in zip_f.namelist():
    zinfo = zip_f.getinfo(f)
    if zinfo.is_dir():
        print(f)

# Only root directories:
root_dirs = []
for f in zip_f.namelist():
    zinfo = zip_f.getinfo(f)
    if zinfo.is_dir():
        # This will work on any OS because the zip format
        # specifies a forward slash.
        r_dir = f.split('/')
        r_dir = r_dir[0]
        if r_dir not in root_dirs:
            root_dirs.append(r_dir)
for d in root_dirs:
    print(d)
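
For large archives, a variant worth trying (a sketch under assumptions, not a measured improvement) iterates `infolist()` so that no per-name `getinfo()` lookup is needed, and collects root names in a set so that membership checks do not scan a list. The archive is built in memory here purely so the example runs without a file on disk; the member names are invented.

```python
import io
from zipfile import ZipFile

# Build a small archive in memory (hypothetical contents) for demonstration.
buf = io.BytesIO()
with ZipFile(buf, 'w') as zf:
    zf.writestr('folder/', '')           # explicit directory entry
    zf.writestr('folder/f1.txt', 'a')
    zf.writestr('folder2/', '')
    zf.writestr('folder2/sub/', '')
    zf.writestr('top.txt', 'b')

with ZipFile(buf) as zf:
    # infolist() yields ZipInfo objects directly.
    dir_entries = [zi.filename for zi in zf.infolist() if zi.is_dir()]
    # A set comprehension removes duplicates; sorted() gives a stable order.
    root_dirs = sorted({name.split('/')[0] for name in dir_entries})

print(dir_entries)  # ['folder/', 'folder2/', 'folder2/sub/']
print(root_dirs)    # ['folder', 'folder2']
```

Like the loop above, this only sees directories that have their own entry in the archive.
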
Devyzr
  • I had to add os.chdir(directory) before root_dirs = [] and removed if zinfo.is_dir() from my code to make it work. Thank you, this helped a lot. – Nahuatl_C137 Oct 19 '18 at 16:54
  • Thanks @Nahuatl_C137! I think you needed to use chdir because my example requires absolute paths (fixed that), but I'm a bit confused about is_dir() not working, since it only checks for the existence of '/' at the end of the filename. What behavior were you getting? – Devyzr Oct 19 '18 at 18:41
  • I have a zip file, and within it about 40 or so folders, with 700 or so documents. Before removing zinfo.is_dir(), the code was iterating through every single file name (PDFs), i.e. "FolderName/PdfName" and checking for a "/" at the end. I figured I would remove this line and keep the split, and then see the result and adjust if necessary, but it turned out to be just what I needed; a unique list of folder names within a .zip. – Nahuatl_C137 Oct 19 '18 at 19:25
  • That's the purpose though, the ```zinfo.is_dir()``` is supposed to check once so that you don't do the split and check the array for the folder name for every element in the zip, reducing the number of operations. The result would be the same, but the ```is_dir()``` check *should* make it faster. – Devyzr Oct 23 '18 at 16:07
  • https://imgur.com/a/COKVpsF Check it out.. I'm not doing anything differently except taking out that if statement. I don't get a single folder name with it. How could I test if NOT zinfo.is_dir()? I wonder what that would yield. – Nahuatl_C137 Oct 23 '18 at 17:36
  • Weird, what's your zip structure? – Devyzr Nov 02 '18 at 21:15
1

Something more along these lines:

set([os.path.split(x)[0] for x in zf.namelist() if '/' in x])

because Python's zipfile does not necessarily store the folders as separate entries
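
A small illustration of that point, assuming an archive written without explicit directory entries (built in memory here with invented member names): filtering on a trailing `/` finds nothing, while splitting the file paths still recovers the folders.

```python
import io
import os
import zipfile

# Write two files into nested paths without adding any directory entries.
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr('folder/a.txt', 'a')
    zf.writestr('folder2/b.txt', 'b')

with zipfile.ZipFile(buf) as zf:
    names = zf.namelist()

# No name ends with '/', so this filter comes back empty.
explicit = [x for x in names if x.endswith('/')]
# Splitting each path recovers the folder part of every file name.
implied = sorted(set(os.path.split(x)[0] for x in names if '/' in x))

print(explicit)  # []
print(implied)   # ['folder', 'folder2']
```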

agroszer