I once downloaded a web page using curl, and the resulting file contains the compressed HTML. I would like to decompress it.
I tried this Python code:
import gzip
with gzip.open(file_name, 'rb') as f:
    file_content = f.read()
which results in the following error:
gzip.BadGzipFile: Not a gzipped file (b'\x1f\xc2')
\x1f and \xc2 are the first two bytes of the file. That is confirmed by:
with open(file_name, "rb") as f:
    binary_file_content = f.read()
for i in range(12):
    print(binary_file_content[i], end=" ")
which prints the first 12 bytes of the file: 31 194 139 8 0 0 0 0 0 0 3 195
(where 31 and 194 are the decimal values of the bytes 0x1F and 0xC2 seen in the error message).
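For reference, a gzip stream normally starts with the magic bytes 1F 8B, followed by 08 for the deflate compression method (RFC 1952). The sketch below compares a header against that signature. The helper name describe_header is hypothetical, and the sample bytes are simply the twelve values printed above.

```python
GZIP_MAGIC = b"\x1f\x8b"  # gzip signature as defined in RFC 1952


def describe_header(header: bytes) -> str:
    """Return the bytes in hex and whether they start with the gzip signature."""
    hex_view = header.hex(" ")  # bytes.hex with a separator needs Python 3.8+
    is_gzip = header.startswith(GZIP_MAGIC)
    return f"{hex_view} (gzip magic: {is_gzip})"


# The twelve byte values printed above, rebuilt as a bytes object
sample = bytes([31, 194, 139, 8, 0, 0, 0, 0, 0, 0, 3, 195])
print(describe_header(sample))
# prints: 1f c2 8b 08 00 00 00 00 00 00 03 c3 (gzip magic: False)
```

The header does not match, but 8b 08 appears directly after the extra c2 byte. That may suggest the gzip data was altered after compression, for example by a text re-encoding step, though this is only a guess.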
Do the first bytes provide a hint as to which decompression method should be used? (I made a few attempts with zlib.decompress, but they have all failed so far.)
Edit: The output of "file myCompressedFile" is "data".