
I'm trying to zip a folder, including every subfolder and file it contains, using os.walk(), but I'm having trouble stripping the path that leads up to the root folder. When I open the zipfile, I would like it to open straight to the root folder instead of starting at D:\\Users\\Username\\Desktop.

I've been trying to use os.path.basename() and zipfile's arcname argument, but I just can't seem to get it right:

def backupToZip(folder):

    import zipfile, os

    folder = os.path.abspath(folder) # make sure folder is absolute

    # Walk the entire folder tree and compress the files in each folder.  
    for foldername, subfolders, filenames in os.walk(folder):

        # Add the current folder to the ZIP file.
        backupZip.write(foldername)

        # Add all the files in this folder to the ZIP file.
        for filename in filenames:
            backupZip.write(os.path.join(foldername, filename))
    backupZip.close()

backupToZip('Sample Folder')
zondo
cafekaze
  • Could you post what you would like the zipped archive structure to look like? Your code as you posted it is a bit confusing because it's recursive without an initial call to the function. – user2027202827 Jun 16 '16 at 02:31
  • http://stackoverflow.com/questions/2212643/python-recursive-folder-read? – Steephen Jun 16 '16 at 02:44
  • @hobenkr: Sorry for any confusion - I would like the zip folder structure to be `Sample Folder\Subfolders and Files\Subfolders and Files` instead of `D:\\Users\\Username\\Desktop\\Sample Folder\\Subfolders and Files\\Subfolders and Files` – cafekaze Jun 16 '16 at 02:48

3 Answers

  1. Use os.chdir to change the current working directory to the folder you want to zip
  2. Make sure that the argument passed to os.walk is a relative path

*Be careful when using os.chdir: it changes the working directory for the whole process


import zipfile, os

def backupToZip(folder):

    cwdpath = os.getcwd() # save original path (*where you run this py file)

    saveToWhere = "tmp.zip"
    zf = zipfile.ZipFile(saveToWhere, mode='w')

    folder = os.path.abspath(folder) # make sure folder is absolute

    try:
        os.chdir(folder) # change to that absolute path

        # os.walk(relative_path)
        for foldername, subfolders, filenames in os.walk("."):
            for filename in filenames:
                zf.write(os.path.join(foldername, filename))
    finally:
        zf.close()
        os.chdir(cwdpath) # back to original path, even if an error occurred
Kir Chou
  • Thanks! This was really close, but besides missing a `zf.write(foldername)` before `for filename in filenames`, it also zipped the local disk for some reason I couldn't figure out. – cafekaze Jun 17 '16 at 06:22

If you want to avoid chdir, which impacts the whole process, you can use relpath to get the relative path starting from your top folder.

You could use something like

def backupToZip(folder):

    import zipfile, os

    folder = os.path.abspath(folder) # make sure folder is absolute

    # Walk the entire folder tree and compress the files in each folder.  
    for foldername, subfolders, filenames in os.walk(folder):

        if foldername == folder:
            archive_folder_name = ''
        else:
            archive_folder_name = os.path.relpath(foldername, folder)

            # Add the current folder to the ZIP file.
            backupZip.write(foldername, arcname=archive_folder_name)

        # Add all the files in this folder to the ZIP file.
        for filename in filenames:
            backupZip.write(os.path.join(foldername, filename), arcname=os.path.join(archive_folder_name, filename))
    backupZip.close()

backupToZip('Sample Folder')
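
To make the effect of relpath concrete, here is a minimal sketch that only computes the archive names, without writing a zip. The helper name `archive_names`, its `keep_root` flag, and the temporary folder layout are made up for illustration. Measuring the relative path from the top folder drops its name; measuring from its parent keeps it.

```python
import os
import tempfile

def archive_names(folder, keep_root=False):
    """Return the sorted archive names for every file under folder (hypothetical helper)."""
    folder = os.path.abspath(folder)
    # Measuring from the parent keeps the top folder's own name in each path.
    base = os.path.dirname(folder) if keep_root else folder
    names = []
    for foldername, subfolders, filenames in os.walk(folder):
        for filename in filenames:
            full_path = os.path.join(foldername, filename)
            # Show forward slashes on every platform, as ZIP archives do internally.
            names.append(os.path.relpath(full_path, base).replace(os.sep, "/"))
    return sorted(names)

def demo():
    """Build a small throwaway tree and compute names both ways."""
    with tempfile.TemporaryDirectory() as tmp:
        top = os.path.join(tmp, "Sample Folder")
        os.makedirs(os.path.join(top, "sub"))
        for rel in ("a.txt", os.path.join("sub", "b.txt")):
            with open(os.path.join(top, rel), "w") as f:
                f.write("x")
        return archive_names(top), archive_names(top, keep_root=True)

if __name__ == "__main__":
    without_root, with_root = demo()
    print(without_root)
    print(with_root)
```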
user2313067
  • Thanks! This is very close, I just had to add in `backupZip = zipfile.ZipFile('backup.zip', 'w')`. Also, it removed the root folder being zipped, i.e. it went straight from the zipfile to the subfolders. Is there any way to retain the root folder? – cafekaze Jun 17 '16 at 06:36
  • Figured out how to retain root folder per answer below - thanks everyone. – cafekaze Jun 17 '16 at 07:23

I made some revisions based on user2313067's answer above and finally got what I wanted, in case anyone is curious:

import zipfile, os

def backupToZip(folder):

    # Make sure folder is absolute.
    folder = os.path.abspath(folder) 

    backupZip = zipfile.ZipFile('backup.zip', 'w')

    backupZip.write(folder, arcname=os.path.basename(folder))


    # Walk the entire folder tree and compress the files in each folder.  
    for foldername, subfolders, filenames in os.walk(folder):

        # Path of this folder inside the archive, relative to the parent of the root folder.
        arcfolder = os.path.relpath(foldername, os.path.dirname(folder))

        # Add the current folder to the ZIP file if not root folder
        if foldername != folder:
            backupZip.write(foldername, arcname=arcfolder)

        # Add all the files in this folder to the ZIP file.
        for filename in filenames:
            backupZip.write(os.path.join(foldername, filename), arcname=os.path.join(arcfolder, filename))
    backupZip.close()
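
For anyone who wants a shorter variant, the following is a sketch of the same idea, not a drop-in replacement. It assumes Python 3, takes the output path as an extra `zip_path` parameter (the code above hardcodes 'backup.zip'), and uses a `with` block so the archive is closed even if an error occurs. The names `backup_to_zip` and `demo` are hypothetical.

```python
import os
import tempfile
import zipfile

def backup_to_zip(folder, zip_path):
    """Zip folder into zip_path, keeping the folder's own name as the archive root."""
    folder = os.path.abspath(folder)
    parent = os.path.dirname(folder)
    with zipfile.ZipFile(zip_path, "w") as backup_zip:
        for foldername, subfolders, filenames in os.walk(folder):
            # Write a directory entry so that empty folders are kept too.
            backup_zip.write(foldername, arcname=os.path.relpath(foldername, parent))
            for filename in filenames:
                full_path = os.path.join(foldername, filename)
                backup_zip.write(full_path, arcname=os.path.relpath(full_path, parent))

def demo():
    """Zip a small throwaway tree and return the names stored in the archive."""
    with tempfile.TemporaryDirectory() as tmp:
        top = os.path.join(tmp, "Sample Folder")
        os.makedirs(os.path.join(top, "empty"))
        with open(os.path.join(top, "a.txt"), "w") as f:
            f.write("hello")
        # The archive lives outside the folder being zipped, so it never includes itself.
        zip_path = os.path.join(tmp, "backup.zip")
        backup_to_zip(top, zip_path)
        with zipfile.ZipFile(zip_path) as zf:
            return sorted(zf.namelist())

if __name__ == "__main__":
    print(demo())
```

Because relpath is measured from the parent of the root folder, the root folder needs no special case here: its own relative path is simply its name.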
cafekaze
  • Note: you should avoid `import`s inside functions. Put them at the start of your module/script. – Bakuriu Jun 17 '16 at 07:35
  • @Bakuriu Thanks for the input - could you please help explain why this is better practice? Any other input on shortening/optimizing would be much appreciated as well. – cafekaze Jun 17 '16 at 16:43
  • First of all, to avoid repetition. If you have 3 functions in a module that need module `os`, you don't want to add `import os` three times. Moreover, putting the imports at the beginning of the file clearly shows the dependencies of a program. Anybody reading it can immediately know whether some 3rd party libraries are required and which ones. Finally: hiding the imports means that if the user does not have the required library installed, the `ImportError` gets triggered only when the function is called, which may make debugging harder; it's better to fail as early as possible. – Bakuriu Jun 17 '16 at 19:28
  • Also inside functions you cannot use the wildcard form: `from os import *` does not work inside a function definition but does at the module level. – Bakuriu Jun 17 '16 at 19:28
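
The last two points in these comments can be checked with a small sketch, assuming Python 3. The function names and the deliberately nonexistent module name are made up for illustration. The code compiles snippets rather than running them, so nothing is actually imported by the wildcard form.

```python
def compiles(source):
    """Return True if source compiles, False if it is a syntax error."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True

def wildcard_in_function_compiles():
    # In Python 3 a wildcard import inside a function body is rejected at compile time.
    return compiles("def f():\n    from os import *\n")

def wildcard_at_module_level_compiles():
    return compiles("from os import *\n")

def call_with_missing_dependency():
    """Defining the function succeeds; the import fails only when it is called."""
    def needs_missing_module():
        import a_module_that_does_not_exist_12345  # hypothetical missing dependency

    try:
        needs_missing_module()
    except ImportError:
        return "ImportError raised at call time"
    return "no error"

if __name__ == "__main__":
    print(wildcard_in_function_compiles())
    print(wildcard_at_module_level_compiles())
    print(call_with_missing_dependency())
```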