
I have a zip file containing some regular files, which is uploaded to a file server. I am trying to compute the SHA-256 checksum of the zip file, write it to a *.sha256sum file, and upload that file to the file server as well.

Anyone who downloads the zip file and the checksum file (.sha256sum) from the file server can then recompute the SHA-256 of the zip file and compare it with the value stored as text in the downloaded checksum file.

When I try to compute the SHA-256 checksum of the zip file, I get an error.

with open(filename) as f:
    data = f.read()
    hash_sha256 = hashlib.sha256(data).hexdigest()

The error below is raised on the line data = f.read():

in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 44: character maps to <undefined>
Willy
  • This has **nothing to do with** the hash computation - as indicated by where the error message is raised. (In the future, please show a [complete](https://meta.stackoverflow.com/questions/359146/) error traceback.) The problem is that you attempt to open a file *that does not represent text, in text mode*. – Karl Knechtel May 17 '23 at 01:39

1 Answer


You must open the file in binary mode:

with open(filename, 'rb') as f:
    data = f.read()
    hash_sha256 = hashlib.sha256(data).hexdigest()

Per Reading and Writing Files in the Python tutorial:

Normally, files are opened in text mode, that means, you read and write strings from and to the file, which are encoded in a specific encoding.

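So in text mode Python decodes the raw bytes into a str using a text encoding (here the platform default, which the traceback reports as the 'charmap' codec). A zip file is arbitrary binary data, so decoding fails on a byte such as 0x90 that has no mapping. For hashing you want the raw bytes, not decoded text.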

Appending a 'b' to the mode opens the file in binary mode. Binary mode data is read and written as bytes objects.
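
For the full workflow described in the question, here is a minimal sketch that reads the archive in binary mode in chunks (so a large zip is never held in memory at once), writes the checksum file, and verifies it after download. The helper names (`sha256_of_file`, `write_checksum_file`, `verify_checksum_file`), the `<name>.sha256sum` naming, and the "digest, two spaces, filename" layout are assumptions for illustration, not something the question specifies.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    # 'rb' is the important part: raw bytes, no text decoding.
    with open(path, "rb") as f:
        # f.read() returns b"" at end of file, which stops the loop.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_checksum_file(path):
    """Write '<path>.sha256sum' next to the file and return its path.

    The naming and the 'digest  filename' layout are assumptions.
    """
    path = Path(path)
    checksum_path = path.with_name(path.name + ".sha256sum")
    line = f"{sha256_of_file(path)}  {path.name}\n"
    checksum_path.write_text(line, encoding="utf-8")
    return checksum_path


def verify_checksum_file(path, checksum_path):
    """Return True if the file's digest matches the one stored as text."""
    stored = Path(checksum_path).read_text(encoding="utf-8")
    expected = stored.split()[0].lower()  # first field is the hex digest
    return sha256_of_file(path) == expected
```

Note that the checksum file itself is plain text, so reading and writing it in text mode is fine; only the zip file needs binary mode.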