
I have many csv files stored in different 7z archives. I want to find specific csv files in those archives and save them, decompressed, in a different directory. I have tried:

import os
import py7zlib

tree = r'Where_the_7zfiles_are_stored'
dst = r'Where_I_want_to_store_the_csvfiles'

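# Walk the directory tree and open every matching 7z archive.
for dirpath, dirnames, filenames in os.walk(tree):
    for myfile in filenames:
        if myfile.endswith('2008-01-01_2008-04-30_1.7z'):
            # Keep the archive file open while its members are being read.
            with open(os.path.join(dirpath, myfile), 'rb') as archive:
                myZip = py7zlib.Archive7z(archive)
                csvInZipFile = zip(myZip.filenames, myZip.files)
                for myCsvFileName, myCsvFile in csvInZipFile:
                    # Extract only the csv files whose names contain '2008-01'.
                    if '2008-01' in myCsvFileName:
                        with open(os.path.join(dst, myCsvFileName), 'wb') as outfile:
                            outfile.write(myCsvFile.read())  # the ValueError is raised here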

but I get the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\'\Anaconda3\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 682, in runfile
    execfile(filename, namespace)
  File "C:\Users\'\Anaconda3\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 85, in execfile
    exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)
  File "C:/Users//'/Documents/Example/unzipfiles.py", line 23, in <module>
    outfile.write(myCsvFile.read())
  File "C:\Users\'\Anaconda3\lib\site-packages\py7zlib.py", line 576, in read
    data = getattr(self, decoder)(coder, data)
  File "C:\Users\'\Anaconda3\lib\site-packages\py7zlib.py", line 634, in _read_lzma
    return self._read_from_decompressor(coder, dec, input, checkremaining=True, with_cache=True)
  File "C:\Users\'\Anaconda3\lib\site-packages\py7zlib.py", line 611, in _read_from_decompressor
    tmp = decompressor.decompress(data)
ValueError: data error during decompression

The odd thing is that the method works fine for the first two csv files. I have no idea how to get to the root of the problem. The data in the failing csv files does not seem to differ from the others, and manually unpacking the csv files with IZArc works without problems. (The problem occurs in both Python 2.7 and 3.4.)

I have also tried the lzma module, but with it I could not figure out how to retrieve the individual csv files contained in the 7z file.

user3820991
  • It works for me, but I do have a couple of questions. You seem to be using "wb" for the mode; are these binary csv files? Can you provide a failing 7z file? Maybe reduce the archive's contents to one of the two files that work and one of the ones that fail. Are you 100% certain that the destination directory contains the appropriate file structure? If not, you may want to [automatically create the parent directory](http://stackoverflow.com/questions/12517451/python-automatically-creating-directories-with-file-output) – OYRM Apr 23 '15 at 16:29
  • The problem is the `myCsvFile.read()` part. How can I provide the respective 7z files? – user3820991 Apr 24 '15 at 11:00
  • Yes, the issue is occurring there, as seen in your trace. Sharing the 7z file would be useful diagnostically, but if you don't have a means of doing so, then I can recommend an experiment. First, some more details: does this always fail on the same 7z file? If so, does it always fail while reading the same CSV file from the 7z file? – OYRM Apr 24 '15 at 12:40
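
The two points raised in the comments (making sure the destination directories exist, and checking whether the same csv file fails every time) could be combined into a small diagnostic helper. This is only a minimal sketch and not a fix for the decompression error itself: `extract_matching` is a hypothetical name, and the sketch assumes that each archive member exposes a `read()` method and that failures surface as `ValueError`, as in the traceback above.

```python
import os


def extract_matching(members, dst, pattern):
    """Read each (name, reader) pair on its own and report what failed.

    `members` is any iterable of (name, object-with-.read()) pairs, for
    example zip(myZip.filenames, myZip.files) from the question.
    Returns (written, failed): paths written, and (name, message) pairs.
    """
    written, failed = [], []
    for name, reader in members:
        if pattern not in name:
            continue
        try:
            data = reader.read()
        except ValueError as exc:
            # Assumption: errors look like the ValueError in the traceback.
            # Record it and carry on so one bad member does not hide others.
            failed.append((name, str(exc)))
            continue
        target = os.path.join(dst, name)
        parent = os.path.dirname(target)
        # Member names may contain subdirectories; create them if missing.
        # (isdir + makedirs is used because it works on 2.7 and 3.4 alike.)
        if parent and not os.path.isdir(parent):
            os.makedirs(parent)
        with open(target, 'wb') as outfile:
            outfile.write(data)
        written.append(target)
    return written, failed
```

With the code from the question, it could be called as `extract_matching(zip(myZip.filenames, myZip.files), dst, '2008-01')`. Running it twice on the same archive would show whether the entries in `failed` are always the same csv files.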
