I have been searching for hours for a way to create a valid .tar.gz file using streams in Delphi 10.

I was able to solve the tarball part using LibTar, which works well.

After some searching I also found examples to decompress gzip data using just System.ZLib. The secret lies in the WindowBits parameter:

// windowBits = 15 (32 KB window) + 16 = gzip-only mode
DecompStream:= TZDecompressionStream.Create(SourceStream, 15 + 16);
TarStream:= TTarArchive.Create(DecompStream);
TarStream.Reset;
while TarStream.FindNext(DirRec) do {...} TarStream.ReadFile(TargetStream);

Great! But is it really possible that System.ZLib can decompress gzip (I guess by just skipping the gzip header because of that +16?) but cannot create such a header itself? Whatever I try, I only get a file that 7-Zip or WinRAR cannot open, because the header is missing.

Maybe it just can't work because the gzip header contains a checksum, so the header can't be written without knowing the data that follows. How can this be solved? Edit: this is wrong, see the comments: the CRC-32 is in the trailer.

It seems many others have this problem too. I found and tried multiple solutions for adding this header, but nothing really worked, and everything requires adding long units (not nice but acceptable) or even DLLs (not acceptable for me).

maf-soft
  • https://stackoverflow.com/questions/1838699/how-can-i-decompress-a-gzip-stream-with-zlib – David Heffernan Oct 15 '18 at 12:12
  • So you are telling me (at least it feels like it, I can only guess) that I could have learned how to compress gzip with Delphi by finding and reading a non-Delphi question about DEcompression using zlib, where the second answer, titled "python", mentions compression only through a small "(de-)" somewhere in the middle of the text, and that this makes my question and answer obsolete? Yes, it's all there, somewhere. But for me it was really hard to find, so I hope my question and answer will help others now. If you google for "delphi gzip compression" you will hopefully understand the problem. – maf-soft Oct 15 '18 at 12:29
  • No, I'm just adding a link to another pertinent question so that the two questions are linked together. You can see the list of linked topics on the right hand side. – David Heffernan Oct 15 '18 at 13:15
  • https://stackoverflow.com/questions/32498990/decompress-deflatestream-c-in-delphi Here's another one that is Delphi related. – David Heffernan Oct 15 '18 at 13:50
  • This unit contains a good implementation of compress/uncompress stream in the Gzip format: https://github.com/mike-lischke/GraphicEx/blob/master/3rd%20party/DelphiZlib/ZLibExGZ.pas – silvioprog Feb 22 '19 at 14:51

1 Answer

The secret lies in the WindowBits parameter - sounds familiar? :)

Believe it or not, compressing to gzip works just the same way! I couldn't find this anywhere on Google or in the Embarcadero documentation/help. But have a look at this comment in the System.ZLib source of Delphi Tokyo:

Add 16 to windowBits to write a simple gzip header and trailer around the compressed data instead of a zlib wrapper. The gzip header will have no file name, no extra data, no comment, no modification time (set to zero), no header crc, and the operating system will be set to 255 (unknown).

It works:

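TargetStream:= TFileStream.Create(TargetFilename, fmCreate);
// 15 = 32 KB window, +16 = write a gzip header and trailer instead of the zlib wrapper
CompressStream:= TZCompressionStream.Create(TargetStream, zcDefault, 15 + 16);
TarStream:= TTarWriter.Create(CompressStream);
TarStream.AddStream(SourceStream1, SourceFilename1, Now);
TarStream.AddString(SourceString2, SourceFilename2, Now);
// Free in reverse order of creation; destroying the compression stream flushes the remaining data and the gzip trailer
TarStream.Free;
CompressStream.Free;
TargetStream.Free;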
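
To check the windowBits behaviour without LibTar or any files, here is a minimal in-memory round trip. It is only a sketch: the program name, the constant `GzipWindowBits` and all variable names are made up for illustration, and it assumes a Delphi version whose System.ZLib has the `windowBits` constructor overloads used above. The first two bytes printed should be the gzip magic bytes 1F 8B defined in RFC 1952.

```delphi
program GzipRoundTrip;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.ZLib;

const
  // Hypothetical name: 15 = 32 KB window, +16 = gzip header and trailer
  GzipWindowBits = 15 + 16;

var
  GzData, Unpacked: TBytesStream;
  Compressor: TZCompressionStream;
  Decompressor: TZDecompressionStream;
  Data, Buffer: TBytes;
  Count: Integer;
begin
  Data := TEncoding.UTF8.GetBytes('hello gzip');
  GzData := TBytesStream.Create;
  Unpacked := TBytesStream.Create;
  try
    // Compress into memory using the gzip wrapper
    Compressor := TZCompressionStream.Create(GzData, zcDefault, GzipWindowBits);
    try
      Compressor.WriteBuffer(Data[0], Length(Data));
    finally
      // Destroying the stream flushes pending data and writes the gzip trailer
      Compressor.Free;
    end;

    // Show the first two bytes of the compressed data (the gzip magic bytes)
    Writeln(Format('%.2x %.2x', [GzData.Bytes[0], GzData.Bytes[1]]));

    // Decompress again with the same windowBits value
    GzData.Position := 0;
    Decompressor := TZDecompressionStream.Create(GzData, GzipWindowBits);
    try
      SetLength(Buffer, 1024);
      repeat
        Count := Decompressor.Read(Buffer[0], Length(Buffer));
        if Count > 0 then
          Unpacked.WriteBuffer(Buffer[0], Count);
      until Count = 0;
    finally
      Decompressor.Free;
    end;

    // Print the round-tripped text
    Writeln(TEncoding.UTF8.GetString(Unpacked.Bytes, 0, Integer(Unpacked.Size)));
  finally
    Unpacked.Free;
    GzData.Free;
  end;
end.
```

The compression stream is freed before the result is inspected because the gzip trailer is only written when the stream is destroyed.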
maf-soft
  • There is always a CRC to verify file integrity in the gzip format. What is not there is a separate _header_ CRC, for just the 10-byte header. Not a problem. – Mark Adler Oct 15 '18 at 19:03
  • Aaaah, thanks, @MarkAdler! Hmm, but that means that the `TZCompressionStream` has to buffer all the data in memory, because it has to wait for the end to know the size and CRC before putting them in the header (just checked the header format). I didn't expect that. OK, for the record: I am removing my "I think it's bad not to have a checksum to verify file integrity." from the answer... – maf-soft Oct 15 '18 at 19:38
  • No, the gzip format puts the size and CRC in the trailer, not the header. – Mark Adler Oct 16 '18 at 00:26
  • That was also my first logical conclusion, but I wasn't able to verify this here http://zlib.org/rfc-gzip.html#header-trailer - the title mentions a trailer, but not the text. So I found some posting somewhere else, that the trailer is just a #0 -> 10+1=11 (wrong). - But now, thanks for your help, I scrolled a page up from this chapter and understand :) thank you for helping and your work. It can all be very misleading when you try to save time, especially when you make wrong assumptions, like thinking that windowBits might be something to skip unknown headers on decompression. (continued...) – maf-soft Oct 16 '18 at 06:05
  • When you google around, you find many projects using zlib for the compression, but manually putting a gzip wrapper around it, using a lot of own code - again a source for wrong assumptions. Even now with knowing what to look for, I don't see many people using it correctly (I only looked for Delphi code!). Is it new? Since when is this supported, @MarkAdler? I'm new to it, and anyway not the linux guy, so to me, it was a Delphi system unit and Embarcadero should add such important things in their documentation. I hope this question will help others with the same problem. – maf-soft Oct 16 '18 at 06:15
  • Looks like in-memory gzip was added in zlib 1.2.1, in 2003. – Mark Adler Oct 16 '18 at 06:22
  • I know it's not good to make this conversation any longer, but let me remark: if someone has any better tar code than I'm using, please tell :) (but it works and I'm happy with this) - maybe it's an idea to include tar support in future zlib versions? ;-) Would be really great, but no reply necessary. – maf-soft Oct 16 '18 at 06:28