
I have a file structure something like this:

/a.zip

    /not_a_zip/

        contents

    /b.zip

        contents

I want to create a directory a, extract a.zip into it, and extract every nested zip file in place, so that I end up with something like this:

/a/

    /not_a_zip/

        contents

    /b/

        contents

I tried this solution, but I got errors because my main directory contains subdirectories as well as zip files.

I want to be able to extract the main zip file into a directory of the same name, then be able to extract all nested files within, no matter how deeply nested they are.

EDIT: my current code is this:

archive = zipfile.ZipFile(zipped, 'r')
for file in archive.namelist():
    archive.extract(file, resultDirectory)

for f in [filename for filename in archive.NameToInfo if filename.endswith(".zip")]:
    # get file name and path to extract
    fileToExtract = resultDirectory + '/' + f
    # get directory to extract new file to
    directoryToExtractTo = fileToExtract.rsplit('/', 1)
    directoryToExtractTo = directoryToExtractTo[0] + '/'
    # extract nested file
    nestedArchive = zipfile.ZipFile(fileToExtract, 'r')
    for file in nestedArchive.namelist():
        nestedArchive.extract(fileToExtract, directoryToExtractTo)

but I'm getting this error:

KeyError: "There is no item named 'nestedFileToExtract.zip' in the archive"

even though the file exists in the file system.

Eric F

1 Answer


Based on these other solutions: this and this.

import os
import io
import sys
import zipfile


def extract_with_structure(input_file, output):
    with zipfile.ZipFile(input_file) as zip_file:
        print(f"namelist: {zip_file.namelist()}")
        for obj in zip_file.namelist():
            filename = os.path.basename(obj)

            if not filename:
                # Skip folders
                continue

            if 'zip' == filename.split('.')[-1]:
                # extract a zip
                content = io.BytesIO(zip_file.read(filename))
                f = zipfile.ZipFile(content)
                dirname = os.path.splitext(os.path.join(output, filename))[0]
                for i in f.namelist():
                    f.extract(i, dirname)
            else:
                # extract a file
                zip_file.extract(obj, os.path.join(output))


if __name__ == "__main__":
    if len(sys.argv) < 3:
        # Both the zip file and the output folder are required
        print("Usage: python3 script.py <zipfile> <output_folder>")
        sys.exit(1)

    extract_with_structure(sys.argv[1], sys.argv[2])
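
One thing to watch for: ZipFile.read() and ZipFile.extract() look members up by their full name inside the archive, so passing only a basename (or a path on disk) raises a KeyError for anything stored in a subdirectory. A minimal sketch of a recursive variant that avoids this is below. It extracts everything to disk first and then expands each nested zip where it sits. The name extract_nested and the remove_archives flag are hypothetical, and deleting each nested zip after expanding it is an assumption based on the desired layout in the question.

```python
import os
import zipfile


def extract_nested(zip_path, dest_dir=None, remove_archives=True):
    """Extract zip_path into dest_dir, then expand nested zips in place.

    Hypothetical helper: if dest_dir is omitted, a.zip is extracted
    into a directory named a next to the archive.
    """
    if dest_dir is None:
        dest_dir = os.path.splitext(zip_path)[0]

    with zipfile.ZipFile(zip_path) as archive:
        # extractall uses the full member names, so subdirectories are kept
        archive.extractall(dest_dir)

    # Collect nested archives first, so the tree is not modified mid-walk
    nested = []
    for root, _dirs, files in os.walk(dest_dir):
        for name in files:
            path = os.path.join(root, name)
            if name.lower().endswith(".zip") and zipfile.is_zipfile(path):
                nested.append(path)

    for path in nested:
        # b.zip becomes b/ in the same directory; deeper levels are
        # handled by the recursive call
        extract_nested(path, remove_archives=remove_archives)
        if remove_archives:
            # Assumption: the nested .zip is no longer wanted once expanded
            os.remove(path)

    return dest_dir
```

Calling extract_nested("a.zip") should then create a/ beside the archive, with each nested zip replaced by a directory of the same name.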
  • I am still getting the same KeyError when using your code block. It seems the code doesn't extract a directory with the same name as the root zip file in the file structure like this: /directory_to_extract.zip >/directory_to_extract/ – Eric F Dec 22 '20 at 21:55
  • Try to run the script with: python3 script.py file.zip output_folder – juananthony Dec 22 '20 at 22:41