I'm trying to zip a folder in Python 3 with the module zipfile
.
Since I'm german I have some filenames containing umlauts (äöü).
While zipping, I get a UnicodeEncodeError: 'utf-8' codec can't encode character '\udcfc' in position 95: surrogates not allowed
.
The character in question is an ü
.
How can I get zipfile
to zip all my files?
The relevant code is this:
def zipdir(path, ziph):
for root, dirs, files in os.walk(path):
for file in files:
ziph.write(os.path.join(root, file))
if __name__ == '__main__':
zipf = zipfile.ZipFile('path/to/destination', 'w', zipfile.ZIP_DEFLATED)
zipdir('path/to/folder', zipf)
zipf.close()
Edit:
I've got the same error when I'm using shutil.make_archive
.
import shutil
shutil.make_archive('/path/to/destination', 'zip', '/path/to/folder')
Full stacktrace of shutil.make_archive()
:
Traceback (most recent call last):
File "/usr/lib64/python3.7/zipfile.py", line 452, in _encodeFilenameFlags
return self.filename.encode('ascii'), self.flag_bits
UnicodeEncodeError: 'ascii' codec can't encode character '\udcfc' in position 59: ordinal not in range(128)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "run.py", line 39, in <module>
archive_dir(path, zip_fullpath)
File "run.py", line 19, in archive_dir
shutil.make_archive(dest, 'zip', source)
File "/home/sean/.local/share/virtualenvs/backup-script-QUcRKrDQ/lib/python3.7/shutil.py", line 822, in make_archive
filename = func(base_name, base_dir, **kwargs)
File "/home/sean/.local/share/virtualenvs/backup-script-QUcRKrDQ/lib/python3.7/shutil.py", line 720, in _make_zipfile
zf.write(path, path)
File "/usr/lib64/python3.7/zipfile.py", line 1746, in write
with open(filename, "rb") as src, self.open(zinfo, 'w') as dest:
File "/usr/lib64/python3.7/zipfile.py", line 1473, in open
return self._open_to_write(zinfo, force_zip64=force_zip64)
File "/usr/lib64/python3.7/zipfile.py", line 1586, in _open_to_write
self.fp.write(zinfo.FileHeader(zip64))
File "/usr/lib64/python3.7/zipfile.py", line 442, in FileHeader
filename, flag_bits = self._encodeFilenameFlags()
File "/usr/lib64/python3.7/zipfile.py", line 454, in _encodeFilenameFlags
return self.filename.encode('utf-8'), self.flag_bits | 0x800
UnicodeEncodeError: 'utf-8' codec can't encode character '\udcfc' in position 59: surrogates not allowed
Full stacktrace of zipfile
:
Traceback (most recent call last):
File "/usr/lib64/python3.7/zipfile.py", line 452, in _encodeFilenameFlags
return self.filename.encode('ascii'), self.flag_bits
UnicodeEncodeError: 'ascii' codec can't encode character '\udcfc' in position 95: ordinal not in range(128)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "run.py", line 41, in <module>
zipdir(path, zipf)
File "run.py", line 16, in zipdir
ziph.write(filepath)
File "/usr/lib64/python3.7/zipfile.py", line 1746, in write
with open(filename, "rb") as src, self.open(zinfo, 'w') as dest:
File "/usr/lib64/python3.7/zipfile.py", line 1473, in open
return self._open_to_write(zinfo, force_zip64=force_zip64)
File "/usr/lib64/python3.7/zipfile.py", line 1586, in _open_to_write
self.fp.write(zinfo.FileHeader(zip64))
File "/usr/lib64/python3.7/zipfile.py", line 442, in FileHeader
filename, flag_bits = self._encodeFilenameFlags()
File "/usr/lib64/python3.7/zipfile.py", line 454, in _encodeFilenameFlags
return self.filename.encode('utf-8'), self.flag_bits | 0x800
UnicodeEncodeError: 'utf-8' codec can't encode character '\udcfc' in position 95: surrogates not allowed
Update:
I've tried some solutions that seemed to work for some at the posted link. This is what I've got:
with
ziph.write(filepath.encode('utf8','surrogateescape').decode('ISO-8859-1'))
I got:
Traceback (most recent call last):
File "run.py", line 41, in <module>
zipdir(path, zipf)
File "run.py", line 16, in zipdir
ziph.write(filepath.encode('utf8','surrogateescape').decode('ISO-8859-1'))
File "/usr/lib64/python3.7/zipfile.py", line 1713, in write
zinfo = ZipInfo.from_file(filename, arcname)
File "/usr/lib64/python3.7/zipfile.py", line 506, in from_file
st = os.stat(filename)
FileNotFoundError: [Errno 2] No such file or directory: '/some/path/to/documents/DIS_Broschüre_DE.pdf'
So the encoding/decoding returned something that can not be found in the file system.
The other option: ziph.write(filepath.encode('utf8','surrogateescape').decode('utf-8'))
got me
Traceback (most recent call last):
File "run.py", line 41, in <module>
zipdir(path, zipf)
File "run.py", line 16, in zipdir
ziph.write(filepath.encode('utf8','surrogateescape').decode('utf-8'))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in position 96: invalid start byte