15

I've been learning python for about 3 weeks now, and I'm currently trying to write a little script for sorting files (about 10.000) by keywords and date appearing in the filename. Files before a given date should be added to an archive. The sorting works fine, but not the archiving

It creates an archive - the name is fine - but in the archive is the complete path to the files. If i open it, it looks like: folder1 -> folder2 -> folder3 -> files.

How can I change it such that the archive only contains the files and not the whole structure?

Below is a snippet with my zip function, node is the path where the files were before sorting, folder is a subfolder with the files sorted by a keyword in the name, items are the folders with files sorted by date.

I am using Python 2.6

def ZipFolder(node, zipdate):
    xynode = node + '/xy'
    yznode = node + '/yz'
    for folder in [xynode,yznode]:
        items = os.listdir(folder)
        for item in items:
            itemdate = re.findall('(?<=_)\d\d\d\d-\d\d', item)
            print item
            if itemdate[0] <= zipdate:
                arcname = str(item) + '.zip'
                x = zipfile.ZipFile(folder + '/' + arcname, mode='w', compression = zipfile.ZIP_DEFLATED)
                files = os.listdir(folder + '/' + item)
                for f in files:
                    x.write(folder + '/' + item + '/' + f)
                    print 'writing ' + str(folder + '/' + item + '/' + f) + ' in ' + str(item)
                x.close()
                shutil.rmtree(folder + '/' + item)
    return

I am also open to any suggestions and improvements.

Shawn Chin
  • 84,080
  • 19
  • 162
  • 191
rny
  • 153
  • 1
  • 1
  • 4

1 Answers1

21

From help(zipfile):

 |  write(self, filename, arcname=None, compress_type=None)
 |      Put the bytes from filename into the archive under the name
 |      arcname.

So try changing your write() call with:

x.write(folder + '/' + item + '/' + f, arcname = f)

About your code, it seems to me good enough, especially for a 3 week pythonist, although a few comments would have been welcomed ;-)

rodrigo
  • 94,151
  • 12
  • 143
  • 190
  • 2
    Not comments but docstring. Explain what the function does and what is it useful for, not how it works and the docstring is most useful for that. – Jan Hudec Aug 10 '11 at 09:28
  • 1
    @Jan Hudec: with docstring you mean sth like this: def ZipFolder(node, zipdate): """zips files older than 'zipdate' in the subfolders of 'node'""" – rny Aug 10 '11 at 09:49
  • @rny: Yes. It's the standard way to document functions in python. – Jan Hudec Aug 10 '11 at 10:19
  • 2
    Get rid of that ugly joining. x.write(folder, item, f, arcname = f, sep='/') – SpringField Jan 21 '14 at 16:22
  • @SpringField: That function is not a `print()`... it is not even a `file.write()`, and in the docs I don't see any reference to the argument `sep`. – rodrigo Jan 22 '14 at 08:42
  • @rodrigo , Python Cookbook - page 78. – SpringField Jan 22 '14 at 14:04
  • @SpringField: I don't know what page says, but... `x.write(folder, item, arcname=f) -> TypeError: got multiple values for keyword argument 'arcname'`. `x.write(..., arcname=f, sep='/') -> TypeError: got unexpected keyword argument 'sep'`. – rodrigo Jan 22 '14 at 14:20
  • @rodrigo , a picture from my interpreter with the "sep" in action. http://imgbox.com/jNp6FKd6 – SpringField Jan 22 '14 at 14:45
  • @SpringField: That's the `print()` function that does have a `sep` function. The one in my answer is a `zipfile.ZipFile.write()` function, and that has nothing to do with the former. No `sep`, no `end`, no `file`, no multiple arguments; that can be seen in the help string I quoted in the answer. – rodrigo Jan 22 '14 at 16:46
  • 5
    While @Springfield is technically incorrect, it does seem like it would be more correct to use: `x.write(os.path.join(folder,item,f),arcname=f)`. – tufelkinder Nov 13 '14 at 05:27
  • I knew this answer and yet came looking because, the documentation specifically says the opposite: Write the file named filename to the archive, giving it the archive name arcname (by default, this will be the same as filename, **but without a drive letter and with leading path separators removed**). – trss Aug 07 '15 at 15:48
  • I think they're referring only to the name of the file and not the directory structure, which will remain as it is in the filename. – trss Aug 07 '15 at 15:50
  • Please add an explanation as to which part exactly omits the directory path. Duplicates are redirected here without a proper explanation. – Arete Aug 13 '22 at 13:12
  • @Arete: The `write()` function takes two parameters: the first one is the name of a local file whose data is to be added to the zip file; the second one is the name under which that data will be stored. So a call to `z.write('c:\\users\\rodrigo\\Desktop\\My Document.doc', 'd.doc')` will store that file from my desktop as a plain `d.doc` inside the root directory of the zip file. If instead of `d.doc` you write `foo/d.doc` then it will _create_ a directory `foo` inside the zip file and store the `d.doc` archive there. – rodrigo Aug 16 '22 at 07:43