
I'm struggling to unzip files with the code below.

I have two separate .zip files, and each contains a file with the same name and type. When I run the code, only one extracted file appears instead of two.

This is the result:

screenshot

Code:

import os, zipfile

dir_name = 'C:\\Users\\Efste\\Desktop\\Test'
extension = ".zip"

os.chdir(dir_name) # change directory from working dir to dir with files

for item in os.listdir(dir_name):
    if item.endswith(extension):
        file_name = os.path.abspath(item)
        with zipfile.ZipFile(file_name, 'r') as zipObj:
            listOfFileNames = zipObj.namelist()
            for fileName in listOfFileNames:
                zipObj.extract(fileName)
                zipObj.extract(fileName, os.path.basename(item).replace('.zip',''))

What I need is to keep both files by adding an incremental number to the files that are duplicated.

Dharman
  • Self-contained question. What is the undesired behaviour: did extracting the same file name from the second ZIP file overwrite the one that was extracted before? Can you provide a screenshot or tree structure of both ZIP files that shows the preconditions and also the files actually extracted? – hc_dev Jul 08 '21 at 20:34
  • The problem is that only one file is extracted, because each zip contains an item with the same name; there should be two files. – Emerson Fierro S Jul 08 '21 at 20:53
  • Then you need to check if `fileName` already exists before calling `zipObj.extract`, and create a suitable destination file name. – chepner Jul 08 '21 at 21:30

1 Answer


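You can avoid overwriting without checking whether the destination file already exists: give every extracted file a unique name, for example by adding a UUID to it.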

Note:

  • I added debug printing to the console, so you can see what's happening.
import os
import sys
import uuid
import zipfile

def guid1():
    # random UUID as a string, used to make each extracted file name unique
    return str(uuid.uuid4())

def zipextract(zip_file, dest_folder):
    print('reading zip: {}'.format(zip_file))
    destination = os.path.abspath(dest_folder)
    os.makedirs(destination, exist_ok=True)  # create the target folder if missing
    with zipfile.ZipFile(zip_file, 'r') as myzip:
        for zib_e in myzip.namelist():
            filename = os.path.basename(zib_e)
            if not filename:
                continue  # directory entry, nothing to write
            # splitext keeps the leading dot in the extension, e.g. ('report', '.txt')
            stem, extension = os.path.splitext(filename)
            # place the unique ID between file name and extension
            extract_to = os.path.join(destination, stem + "_" + guid1() + extension)
            print("extracting: '{}' to '{}'".format(filename, extract_to))
            # export to the given location one by one
            with open(extract_to, 'wb') as output:
                output.write(myzip.read(zib_e))

# execute only if run as a script
if __name__ == "__main__":
    # get command line arguments (0 is the command called, e.g. your script)
    zip_file = sys.argv[1]   # not used below: every .zip in dir_name is processed
    to_folder = sys.argv[2]  # destination folder for the extracted files

    dir_name = 'C:\\Users\\Efste\\Desktop\\Test'
    os.chdir(dir_name)  # change directory from working dir to specified
    for f in os.listdir(dir_name):
        if f.endswith(".zip"):
            zipextract(f, to_folder)

So you would get a unique ID between file_name and extension.
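
If you prefer the incremental number asked for in the question over a UUID, the idea from the comments (check whether the destination exists, then pick a free name) could look roughly like the sketch below. The helper names `unique_path` and `extract_with_counter` and the `_1`, `_2` suffix style are my own assumptions, not part of `zipfile`. Like the code above, it flattens any folder structure inside the archive.

```python
import os
import zipfile


def unique_path(path):
    """Return path unchanged if it is free, else add _1, _2, ... before the extension."""
    stem, ext = os.path.splitext(path)
    candidate = path
    counter = 1
    while os.path.exists(candidate):
        candidate = "{}_{}{}".format(stem, counter, ext)
        counter += 1
    return candidate


def extract_with_counter(zip_path, dest_folder):
    """Extract all files from zip_path into dest_folder without overwriting anything."""
    os.makedirs(dest_folder, exist_ok=True)
    written = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip directory entries
            name = os.path.basename(info.filename)
            target = unique_path(os.path.join(dest_folder, name))
            with open(target, "wb") as out:
                out.write(archive.read(info))
            written.append(target)
    return written
```

Calling `extract_with_counter` once per .zip file with the same destination folder keeps the first file under its original name and numbers only the later duplicates.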

Does this solution work for you as desired?


hc_dev