-1

I am trying to zip each folder on its own in Python. However, the first folder is being zipped and includes all folders within it. Could someone please explain what is going on? Should I not be using shutil for this?

#%% Set path variable
path = r"G:\Folder"
os.chdir(path)
os.getcwd()

#%% Zip all folders
def retrieve_file_paths(dirName):
    # File paths variable
    filePaths = []
    
    # Read all directory, subdirectories and file lists
    for root, directories, files in os.walk(dirName):
        for filename in directories:
            # Createthe full filepath by using os module
            filePath = os.path.join(root, filename)
            filePaths.append(filePath)
            
    # return all paths
    return filePaths

filepaths = retrieve_file_paths(path)

#%% Print folders and start zipping individually
for x in filepaths:
    print(x)
    shutil.make_archive(x, 'zip', path)
Tomerikoo
  • 18,379
  • 16
  • 47
  • 61
Nabih
  • 131
  • 1
  • 2
  • 11
  • What issues are you facing? – The Singularity Sep 30 '21 at 10:30
  • @Luke I can not use zipfile to zip a folder...get a Errno 13 Permission Denied message. When I try and use shutil it takes the first folder, zips it, and includes all folders in root directory within it. Still looking for solutions online – Nabih Sep 30 '21 at 11:35

3 Answers3

1

shutil.make_archive will make an archive of all files and subfolders - since this is what most people want. If you need more choice of what files are included, you must use zipfile directly.

You can do this right within the walk loop (that is what it's for).

import os
import zipfile
dirName = 'C:\...'
# Read all directory, subdirectories and file lists
for root, directories, files in os.walk(dirName):
     zf = zipfile.ZipFile(os.path.join(root, "thisdir.zip"), "w", compression=zipfile.ZIP_DEFLATED, compresslevel=9)
     for name in files:
         if name == 'thisdir.zip': continue
         filePath = os.path.join(root, name)
         zf.write(filePath, arcname=name)
     zf.close()

This will create a file "thisdir.zip" in each subdirectory, containing only the files within this directory.

(edit: tested & corrected code example)

Torben Klein
  • 2,943
  • 1
  • 19
  • 24
  • This just creates a zip file called thisdir.zip with all folders in root directory in it. – Nabih Sep 30 '21 at 11:58
  • 1
    Try it out - the zip files won't include any folders, only files. Each zip file is saved in its respective subfolder (you can change that though). Maybe you should explain more precisely what you mean by "zip each folder on its own"? – Torben Klein Sep 30 '21 at 12:16
  • Got my head around it in the end! Thanks! – Nabih Oct 04 '21 at 11:57
0

Following Torben's answer to my question, I modified the code to zip each directory recursively. I realised what had happened was that I was not specifying sub directories. Code below:

#Set path variable
path = r"insert root directory here"
os.chdir(path)

# Declare the functionto return all file paths in selected directory
def retrieve_file_paths(dirName):
   for root, dirs, files in os.walk(dirName):
       for dir in dirs:
        zf = zipfile.ZipFile(os.path.join(root+dir, root+dir+'.zip'), "w", compression=zipfile.ZIP_DEFLATED, compresslevel=9)
        files = os.listdir(root+dir)
        print(files)
        filePaths.append(files)
        for f in files:
            filepath = root + dir +'/'+ f
            zf.write(filepath, arcname=f)
        zf.close()

retrieve_file_paths(path)
Nabih
  • 131
  • 1
  • 2
  • 11
-1

it's a relativly simple answer once you got a look onto the Docs.

You can see the following under shutil.make_archive: Note This function is not thread-safe.

The way threading in computing works on a high level basis: On a machine there are cores, which can process data. (e.g. AMD Ryzen 5 5800x with 8cores) Within a process, there are threads (e.g. 16 Threads on the Ryzen 5800X).

However, in multiprocessing there is no data shared between the processes. In multithreading within one process you can access data from the same variable.

Because this function is not thread-safe, you will share the variable "x" and access the same item. Which means there can only be one output.

Have a look into multithreading and works with locks in order to sequelize threads.

Cheers

ValentinB
  • 71
  • 7
  • 2
    downvote: this has nothing to do with threading whatsoever. – Torben Klein Sep 30 '21 at 11:17
  • Appreciate the input however I was not receiving errors due to inability of multithreading? That being said, I resorted to using zipfile rather than shutil to zip folders the way I wanted it to. Some good resources: https://stackoverflow.com/questions/2033879/what-does-threadsafe-mean#:~:text=If%20this%20bank%20account%20is,try%20to%20access%20the%20object. and https://www.mindprod.com/jgloss/threadsafe.html – Nabih Oct 04 '21 at 11:54