I am trying to use Python to decompress a .gz file that comes from a hardware board and uses custom headers.

As an example, I have a C++ file that decompresses that file using the inflateInit2 function from zlib, in the following way:

inflateInit2(&stream,MAX_WBITS + 16);

where stream is a z_stream. My usage of the library in Python is as follows:


import zlib
with open("../myfile.gz", "rb") as f:
    data = f.read()  # whole file as bytes
zlib.decompress(data, zlib.MAX_WBITS)

where I change the second parameter of the decompress function to try several possibilities, because the Python library does not expose an inflateInit2 function.

I've found this topic, which gives various examples of using the library, but none of them (nor the C++ computation of the window bits, MAX_WBITS + 16) works in my case. Each attempt produces one of these errors:

  • invalid block type
  • incomplete or truncated stream
  • inconsistent stream state

The C++ code seems to replace the generic .gz header with a custom one, because the compression code contains this line, which is then prepended to the data:

uint8_t headerGz[] = {0x1F, 0x8b, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00};

So my guess is that this header does not match the usual one, and that is why neither gunzip nor libz can decompress the file. The z_stream approach may also process the file differently.
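One way to test that guess is to look at the first ten bytes of the file, since RFC 1952 defines a fixed ten-byte gzip header (two magic bytes, compression method, flags, a four-byte modification time, extra flags, OS code). The following is only a minimal sketch; the helper name `parse_gzip_header` is hypothetical and not part of any library.

```python
import struct

def parse_gzip_header(data: bytes) -> dict:
    """Decode the fixed ten-byte gzip header described in RFC 1952."""
    if len(data) < 10:
        raise ValueError("need at least 10 bytes for a gzip header")
    # Little-endian layout: ID1, ID2, CM, FLG, MTIME (4 bytes), XFL, OS.
    id1, id2, cm, flg, mtime, xfl, os_code = struct.unpack("<BBBBIBB", data[:10])
    return {
        "magic_ok": (id1, id2) == (0x1F, 0x8B),
        "method": cm,      # 8 means deflate
        "flags": flg,      # non-zero means optional fields follow the header
        "mtime": mtime,
        "xfl": xfl,
        "os": os_code,
    }
```

Calling it on the first bytes of the file, for instance `parse_gzip_header(open("../myfile.gz", "rb").read(10))`, shows whether bytes 9 and 10 look like plausible extra-flags and OS values or like the start of something else.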

Am I missing something ? Many thanks ! Kev'

Kevin Heirich

1 Answer

import gzip
with gzip.open("../myfile.gz", "rb") as gz:
    data = gz.read()  # decompressed bytes
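For reference, the same decoding can also be requested from the `zlib` module directly: per the Python documentation, adding 16 to the window bits makes `zlib.decompress` expect a gzip header and trailer, which mirrors `inflateInit2(&stream, MAX_WBITS + 16)`. A minimal sketch, where `gunzip_bytes` is a hypothetical helper name and the sample data is generated in memory rather than taken from the file in the question:

```python
import gzip
import zlib

def gunzip_bytes(data: bytes) -> bytes:
    # MAX_WBITS + 16 selects gzip decoding (header and trailer expected),
    # the same window-bits value passed to inflateInit2 in the C++ code.
    return zlib.decompress(data, zlib.MAX_WBITS + 16)

# Round trip on in-memory sample data.
sample = gzip.compress(b"hello board")
assert gunzip_bytes(sample) == b"hello board"
```

If this fails on a file where `gzip.open` also fails, the problem is in the file contents rather than in the choice of Python API.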
Mark Adler
  • Hey! This raises an `EOFError: Compressed file ended before the end-of-stream marker was reached`, so my guess is that the file does not comply with the `libz` standards. As I have no control over that file, I have to keep looking in this direction. – Kevin Heirich Jul 25 '22 at 07:28
  • That code will work the same as `inflate()` with your invocation of `inflateInit2()`. So whatever issue the Python code is seeing, so would your C++ example. – Mark Adler Jul 25 '22 at 15:11
  • You would need to provide the C++ code here so we could see what it's doing with the header. The pre-pending of eight bytes is odd, since the minimal gzip header is ten bytes. – Mark Adler Jul 25 '22 at 15:12
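While the C++ code is not available, one diagnostic that may help is to skip a fixed-size header and inflate the rest as raw deflate with a streaming decompressor, which returns whatever it could decode instead of raising on a truncated stream. This is a sketch under assumptions, not a fix: it assumes the payload is raw deflate starting right after the header, the default `header_len` of 10 is a guess to experiment with, and `inflate_after_header` is a hypothetical helper name.

```python
import zlib

def inflate_after_header(data: bytes, header_len: int = 10):
    """Raw-inflate everything after a fixed-size header.

    Returns (output, finished, leftover). finished is False when the
    deflate stream stopped before its final block, i.e. it is truncated.
    leftover holds any bytes found after the end of the deflate stream.
    """
    if data[:3] != b"\x1f\x8b\x08":
        raise ValueError("missing gzip magic bytes or deflate method byte")
    # Negative wbits selects raw deflate: no header parsing, no CRC check.
    d = zlib.decompressobj(-zlib.MAX_WBITS)
    out = d.decompress(data[header_len:])
    out += d.flush()
    return out, d.eof, d.unused_data
```

Trying `header_len=8` and `header_len=10` on the file, and checking `finished` and the length of `leftover`, can indicate whether the stream is truncated or whether the trailer is not the usual eight-byte gzip one.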