I am trying to read a zip file in Python that was written with PKZIP:
import zipfile
fname = "myfile.zip"
unzipped = zipfile.ZipFile(fname, "r")
But I get this error:
    unzipped = zipfile.ZipFile(fname, "r")
  File "/home/username/anaconda3/envs/c1/lib/python3.7/zipfile.py", line 1222, in __init__
    self._RealGetContents()
  File "/home/username/anaconda3/envs/c1/lib/python3.7/zipfile.py", line 1285, in _RealGetContents
    endrec = _EndRecData(fp)
  File "/home/username/anaconda3/envs/c1/lib/python3.7/zipfile.py", line 282, in _EndRecData
    return _EndRecData64(fpin, -sizeEndCentDir, endrec)
  File "/home/username/anaconda3/envs/c1/lib/python3.7/zipfile.py", line 228, in _EndRecData64
    raise BadZipFile("zipfiles that span multiple disks are not supported")
zipfile.BadZipFile: zipfiles that span multiple disks are not supported
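The traceback ends in `_EndRecData64`, which parses the zip64 end-of-central-directory locator. Per the PKZIP Application Note, that 20-byte record carries a disk number and a total-disk count. The following is a minimal diagnostic sketch for printing those two fields, assuming the locator sits immediately before the end-of-central-directory record and that the last `PK\x05\x06` in the file really is that record (a comment containing those bytes would fool it). The helper name `zip64_locator_fields` is hypothetical, not part of `zipfile`.

```python
import struct

EOCD_SIG = b"PK\x05\x06"           # end of central directory record
ZIP64_LOCATOR_SIG = b"PK\x06\x07"  # zip64 end of central directory locator
ZIP64_LOCATOR_FMT = "<4sLQL"       # signature, disk number, offset, total disks
ZIP64_LOCATOR_SIZE = struct.calcsize(ZIP64_LOCATOR_FMT)  # 20 bytes
TAIL_BYTES = 66000                 # covers EOCD + max comment + locator


def zip64_locator_fields(path):
    """Return (disk_number, total_disks) from the zip64 locator, or None.

    Hypothetical helper for diagnosis only; it does not validate the archive.
    """
    with open(path, "rb") as fh:
        fh.seek(0, 2)                      # jump to end to learn the size
        size = fh.tell()
        fh.seek(max(0, size - TAIL_BYTES))
        tail = fh.read()

    eocd = tail.rfind(EOCD_SIG)
    if eocd < ZIP64_LOCATOR_SIZE:          # not found, or no room for a locator
        return None

    raw = tail[eocd - ZIP64_LOCATOR_SIZE:eocd]
    sig, disk_number, _offset, total_disks = struct.unpack(ZIP64_LOCATOR_FMT, raw)
    if sig != ZIP64_LOCATOR_SIG:           # archive has no zip64 locator
        return None
    return disk_number, total_disks
```

Calling `print(zip64_locator_fields("myfile.zip"))` would show whether the archive has a zip64 locator at all and, if so, what disk values the writer stored. A total-disk count other than 1 in a single-file archive would be consistent with the error above, though this sketch alone does not prove that is the cause.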
As far as I can tell, this file does not span multiple disks. I say this because:
I checked my version of zipfile against the fix in this Stack Overflow answer, and it is already patched.
It unzips fine with:
$ unzip myfile.zip
on the Linux command line.
So it doesn't seem to actually be a bad zip file. Opening it with raw file access and reading the first few bytes shows a header suggesting that PKZIP may be formatting this file in an unusual way:
b'PK\x03
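For comparison, here is a small sketch of how the leading bytes can be read and matched against a few record signatures listed in the PKZIP Application Note. The helper name `describe_leading_bytes` and the wording of the descriptions are my own, and the table is deliberately incomplete.

```python
# A few 4-byte signatures from the PKZIP Application Note (not exhaustive).
SIGNATURES = {
    b"PK\x03\x04": "local file header (start of an ordinary archive)",
    b"PK\x05\x06": "end of central directory (empty archive)",
    b"PK\x07\x08": "spanning marker (first segment of a spanned/split archive)",
}


def describe_leading_bytes(path):
    """Return (first four bytes, description) for the file at path.

    Hypothetical helper: unknown signatures are reported as 'unknown'.
    """
    with open(path, "rb") as fh:
        magic = fh.read(4)
    return magic, SIGNATURES.get(magic, "unknown")
```

With this, `describe_leading_bytes("myfile.zip")` returns the raw bytes alongside a label. A file starting with `PK\x03\x04` begins like any ordinary archive, so the leading bytes by themselves would not indicate a multi-disk layout.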
The Python library documentation for zipfile refers to the PKZIP Application Note:
The ZIP file format is a common archive and compression standard. This module provides tools to create, read, write, append, and list a ZIP file. Any advanced use of this module will require an understanding of the format, as defined in PKZIP Application Note.
That note is linked here. It is very thorough, but I don't see concrete instructions on which options to pass to zipfile in order to parse the file correctly.
PKZIP is in fairly wide use, so I'm surprised not to find more common examples or native support. What options are necessary to open a PKZIP-created file in Python that throws this multiple-disk error?