1

This thread here advises to use shutilto zip files:

import shutil
shutil.make_archive(output_filename, 'zip', dir_name)

This zips everything in dir_name and maintains the folder structure in it. Is it possible to use this same library to remove all sub-folders and just zip all files in dir_name into the same level? Or must I introduce a separate code chunk to first consolidate the files? For eg., this is a hypothetical folder structure:

\dir_name
   \dir1
       \cat1
          file1.txt
          file2.txt
       \cat2
          file3.txt
   \dir2
       \cat3
          file4.txt

Output zip should just contain:

file1.txt
file2.txt
file3.txt
file4.txt
Tristan Tran
  • 1,351
  • 1
  • 10
  • 36

2 Answers2

2

shutil.make_archive does not have a way to do what you want without copying files to another directory, which is inefficient. Instead you can use a compression library directly similar to the linked answer you provided. Note this doesn't handle name collisions!

import zipfile
import os

with zipfile.ZipFile('output.zip','w',zipfile.ZIP_DEFLATED,compresslevel=9) as z:
    for path,dirs,files in os.walk('dir_name'):
        for file in files:
            full = os.path.join(path,file)
            z.write(full,file) # write the file, but with just the file's name not full path

# print the files in the zipfile
with zipfile.ZipFile('output.zip') as z:
    for name in z.namelist():
        print(name)

Given:

dir_name
├───dir1
│   ├───cat1
│   │       file1.txt
│   │       file2.txt
│   │
│   └───cat2
│           file3.txt
│
└───dir2
    └───cat3
            file4.txt

Output:

file1.txt
file2.txt
file3.txt
file4.txt
Mark Tolonen
  • 166,664
  • 26
  • 169
  • 251
  • @Tristan: From what I read, another advantage of not using `shutil.make_archive`—besides not needing to copy files to another directory first—is that using the `zipfile` module directly will actually compress the data of files in the archive, potentially making the resulting archive much smaller depending on what's in the files. – martineau Jul 14 '21 at 23:19
  • Thanks for the code and advice. I think this would work for after reading around further. – Tristan Tran Jul 15 '21 at 14:59
1
# The root directory to search for
path = r'dir_name/'

import os
import glob

# List all *.txt files in the root directory
file_paths = [file_path 
              for root_path, _, _ in os.walk(path) 
              for file_path in glob.glob(os.path.join(root_path, '*.txt'))]

import tempfile

# Create a temporary directory to copy your files into
with tempfile.TemporaryDirectory() as tmp:
    import shutil

    for file_path in file_paths:
        # Get the basename of the file
        basename = os.path.basename(file_path)

        # Copy the file to the temporary directory  
        shutil.copyfile(file_path, os.path.join(tmp, basename))

    # Zip the temporary directory to the working directory
    shutil.make_archive('output', 'zip', tmp)

This will create a output.zip file in the current working directory. The temporary directory will be deleted when the end of the context manager is reached.

enzo
  • 9,861
  • 3
  • 15
  • 38
  • Thanks. This was somewhat I was thinking of if shutil cannot flatten the directory directly. This works. But @Mark Tolonen's solution seems a little more robust. I like both solutions, to be honest. – Tristan Tran Jul 15 '21 at 14:57