
I have a web application that should combine multiple directories and files into one ZIP file and send it to the end user, whose browser then downloads it automatically.

This means I don't need to store the ZIP on the server's disk; I can generate it in memory and pass it along right away. My initial plan was to use shutil.make_archive, but it writes the ZIP file to disk. A second problem with make_archive is that it takes a single directory and doesn't seem to have any logic for combining multiple directories into one ZIP. Here's a minimal example for reference:

import shutil
zipper = shutil.make_archive('/tmp/test123', 'zip', 'foldera')

I then started looking at zipfile, which is pretty powerful and can be used to loop over directories and files recursively and add them to a ZIP. The answer from Jerr on another topic was a great start. The only problem with zipfile.ZipFile seems to be that it cannot create a ZIP without storing it on disk. The code now looks like this:

import os
import zipfile
import io

def zip_directory(path, ziph):
    # ziph is a zipfile handle
    for root, dirs, files in os.walk(path):
        for file in files:
            ziph.write(os.path.join(root, file),
                       os.path.relpath(os.path.join(root, file),
                                       os.path.join(path, '..')))


def zip_content(dir_list, zip_name):
    zipf = zipfile.ZipFile(zip_name, 'w', zipfile.ZIP_DEFLATED)
    for dir in dir_list:
        zip_directory(dir, zipf)
    zipf.close()

zip_content(['foldera', 'folderb'], 'test.zip')

My folder structure looks like this:

foldera
├── test.txt
└── test2.txt
folderb
├── test.py
└── folderc
    └── script.py

However, the issue is that this ZIP is stored on disk. I have no use for keeping it, and since I have to generate thousands of ZIPs a day, it would take up far too much storage.

Is there a way to avoid storing the ZIP and instead write the output to a BytesIO, str, or other type that I can work with in memory and simply dispose of when I'm done? I've also noticed that a folder without any files in it is not added to the ZIP (for example, a folder 'folderd' with nothing in it). Is it possible to add a folder to the ZIP even if it is empty?

Yenthe

2 Answers

0

As far as I know, there is no way to create ZIPs without storing them on disk (unless you figure out some hacky way of keeping them in memory, but that would be a memory hog). You mentioned that generating thousands of ZIPs a day would fill up your storage, but you could simply delete each ZIP after it has been sent to the user. This does create a file on your server, but only temporarily, and therefore it would not require much storage as long as you delete it after it is sent.
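
One way to get this delete-after-sending behaviour without manual cleanup is a temporary file. The sketch below assumes the standard library's tempfile.TemporaryFile, which removes the file once it is closed. The helper name zip_to_temporary_file is made up for illustration, and the archive layout follows the question's zip_directory.

```python
import os
import tempfile
import zipfile


def zip_to_temporary_file(dir_list):
    """Write the directories in dir_list to a self-deleting temporary ZIP file."""
    tmp = tempfile.TemporaryFile()  # removed automatically once it is closed
    with zipfile.ZipFile(tmp, "w", zipfile.ZIP_DEFLATED) as zipf:
        for directory in dir_list:
            # Paths are stored relative to the directory's parent, so the
            # top-level folder name is kept inside the archive.
            parent = os.path.join(directory, os.pardir)
            for root, dirs, files in os.walk(directory):
                for name in files:
                    full_path = os.path.join(root, name)
                    zipf.write(full_path, os.path.relpath(full_path, parent))
    tmp.seek(0)  # rewind so the caller can read or stream the archive
    return tmp
```

The caller reads or streams from the returned file object and closes it afterwards, at which point the file disappears from disk.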

0

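Use BytesIO. zipfile.ZipFile accepts any file-like object, so the archive can be written to an in-memory buffer instead of a path on disk: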

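import zipfile
from io import BytesIO

# Write the archive into an in-memory buffer instead of a file on disk
filestream = BytesIO()
with zipfile.ZipFile(filestream, mode='w', compression=zipfile.ZIP_DEFLATED) as zipf:
    for directory in dir_list:
        zip_directory(directory, zipf)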

You didn't ask how to send it, but for the record, this is how I send it in my Flask code:

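    filestream.seek(0)  # rewind the buffer before sending it
    # Flask 2.0 renamed attachment_filename to download_name; older versions need the old keyword
    return send_file(filestream, download_name=attachment_filename,
                     as_attachment=True, mimetype='application/zip')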
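
The snippets above rely on zip_directory from the question, which skips folders that contain no files. As a self-contained sketch that also covers the empty-folder part of the question, the version below writes each directory itself as an entry. The function name zip_directories_in_memory is hypothetical, and it assumes that storing paths relative to each directory's parent (as the question's code does) is the desired layout.

```python
import io
import os
import zipfile


def zip_directories_in_memory(dir_list):
    """Return a BytesIO holding a ZIP of every directory in dir_list."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zipf:
        for directory in dir_list:
            parent = os.path.join(directory, os.pardir)
            for root, dirs, files in os.walk(directory):
                # Writing the directory itself creates an entry ending in "/",
                # so folders without files still appear in the archive.
                zipf.write(root, os.path.relpath(root, parent))
                for name in files:
                    full_path = os.path.join(root, name)
                    zipf.write(full_path, os.path.relpath(full_path, parent))
    buffer.seek(0)  # rewind so the caller can read from the start
    return buffer
```

The returned buffer can be passed to send_file in the same way as filestream, and it is garbage-collected like any other object once the response has been sent.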
dWitty