I have lots of csv files contained in different 7z files. I want to find specific csv files in those 7z files and save them decompressed in a different directory. I have tried
import os
import py7zlib
tree = r'Where_the_7zfiles_are_stored'
dst = r'Where_I_want_to_store_the_csvfiles'
for dirpath, dirname, filename in os.walk(tree):
for myfile in filename:
if myfile.endswith('2008-01-01_2008-04-30_1.7z'):
myZip = py7zlib.Archive7z(open(os.path.join(dirpath,myfile), 'rb'))
csvInZipFile = zip(myZip.filenames,myZip.files)
for myCsvFileName, myCsvFile in csvInZipFile:
if '2008-01' in myCsvFileName:
with open(os.path.join(dst,myCsvFileName),'wb') as outfile:
outfile.write(myCsvFile.read())
but I get the following error
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\'\Anaconda3\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 682, in runfile
execfile(filename, namespace)
File "C:\Users\'\Anaconda3\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 85, in execfile
exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)
File "C:/Users//'/Documents/Example/unzipfiles.py", line 23, in <module>
outfile.write(myCsvFile.read())
File "C:\Users\'\Anaconda3\lib\site-packages\py7zlib.py", line 576, in read
data = getattr(self, decoder)(coder, data)
File "C:\Users\'\Anaconda3\lib\site-packages\py7zlib.py", line 634, in _read_lzma
return self._read_from_decompressor(coder, dec, input, checkremaining=True, with_cache=True)
File "C:\Users\'\Anaconda3\lib\site-packages\py7zlib.py", line 611, in _read_from_decompressor
tmp = decompressor.decompress(data)
ValueError: data error during decompression
The odd thing is that the method seems to work fine for the first two csv files. I have no idea how to get to the root of the problem. At least the data in the csv files do not seem to be different. Manually unpacking the different csv files using IZArc goes without problem. (The problem occurred in both python 2.7 and 3.4).
I have also tried to use the lzma module, but here I could not figure out how to retrieve the different csv files contained in the 7z file.