6

Is there a way to know whether a byte[] has been compressed by the .NET GZipStream class?

EDIT: I just want to know whether the byte[] has been compressed (I will always be using GZipStream to compress and decompress).

stackoverflowuser

4 Answers

8

A GZipStream is a DeflateStream with an additional header and trailer.

The format is specified in RFC 1952.


The .NET 4.0 GZipStream class writes the following bytes as header:

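// Layout per RFC 1952: ID1, ID2, CM (8 = deflate), FLG, MTIME (4 bytes), XFL, OS
byte[] headerBytes = new byte[] { 0x1f, 0x8b, 8, 0, 0, 0, 0, 0, 4, 0 };
if (compressionLevel == 10)
{
    headerBytes[8] = 2; // XFL = 2: maximum compression (4 = fastest)
}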

The trailer consists of a CRC32 checksum and the length of the uncompressed data.
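
To illustrate the trailer, here is a minimal sketch that compresses a buffer with GZipStream and reads the length field back from the last four bytes. The class name `GZipTrailerSketch` and both method names are made up for this example; the little-endian layout of the length field is taken from RFC 1952.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper for illustration only; not part of the .NET framework.
public static class GZipTrailerSketch
{
    // Compresses data with GZipStream so there is a real gzip buffer to inspect.
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            }
            // ToArray still works after GZipStream has closed the MemoryStream.
            return output.ToArray();
        }
    }

    // Reads the last 4 bytes of the trailer (uncompressed length, little-endian).
    public static uint ReadUncompressedLength(byte[] gzipped)
    {
        // 10 header bytes + 8 trailer bytes is the smallest size worth inspecting.
        if (gzipped == null || gzipped.Length < 18)
        {
            throw new ArgumentException("Too short to be gzip data.", "gzipped");
        }

        int i = gzipped.Length - 4;
        return (uint)gzipped[i]
            | ((uint)gzipped[i + 1] << 8)
            | ((uint)gzipped[i + 2] << 16)
            | ((uint)gzipped[i + 3] << 24);
    }
}
```

The stored length is the original size modulo 2^32, so it is only a sanity check, not proof that a buffer is valid gzip data.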

dtb
  • Thanks. Where did compressionLevel come from? Also, I just want to know if the byte[] has been compressed (it doesn't matter whether by GZipStream or DeflateStream, since I always use GZipStream; I guess I need to rephrase my question). – stackoverflowuser Jan 11 '11 at 21:33
  • @stackoverflowuser: It's not possible to determine reliably whether a random sequence of bytes represents compressed data if the bytes don't have a header like the one GZipStream writes. – dtb Jan 11 '11 at 21:37
  • Thanks for the explanation. Sorry, but I still don't get the use of the code snippet. I have a byte[], and the snippet above seems to use compressionLevel (??) to set headerBytes. Can you please update the code snippet to take a byte[] as input? – stackoverflowuser Jan 11 '11 at 21:43
3

Thanks to @dtb's explanation, this works for me (@stackoverflowuser, you may want this):

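using System.Linq; // needed for SequenceEqual

public static class CompressionHelper
{
    // The two headers written by the .NET 4.0 GZipStream; they differ only in byte 8 (XFL).
    public static readonly byte[] GZipHeaderBytes = {0x1f, 0x8b, 8, 0, 0, 0, 0, 0, 4, 0};
    public static readonly byte[] GZipLevel10HeaderBytes = {0x1f, 0x8b, 8, 0, 0, 0, 0, 0, 2, 0};

    public static bool IsPossiblyGZippedBytes(this byte[] a)
    {
        // Anything that is null or no longer than the 10-byte header cannot be gzip data.
        if (a == null || a.Length <= 10)
        {
            return false;
        }

        // SubArray is a custom extension method, not part of .NET;
        // a.Take(10).ToArray() is an equivalent built-in alternative.
        var header = a.SubArray(0, 10);

        return header.SequenceEqual(GZipHeaderBytes) || header.SequenceEqual(GZipLevel10HeaderBytes);
    }
}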
Matías Fidemraizer
Jeff Tian
  • how do you use this helper? Helper to what? – Artem A Nov 10 '15 at 18:00
  • 1
    A helper is merely a static class that you can use anywhere without needing to instantiate an object first. I use it in the response pipes of HTTP modules: `CompressionHelper.IsPossiblyGZippedBytes(responseStream);` – Jeff Tian Nov 12 '15 at 08:49
  • It works as expected in .NET 3.5 and 4.5, although I had to implement the SubArray method myself. For how to implement SubArray, see: https://stackoverflow.com/questions/943635/getting-a-sub-array-from-an-existing-array – Luis Fernando Camacho Camacho Mar 12 '18 at 20:35
1

You could look at the first few bytes for the magic header to see whether the data is gzipped, but unless the .NET compressor writes additional information into the comment or another optional field, you probably can't tell which compressor produced it.

http://www.onicos.com/staff/iz/formats/gzip.html

You could also look at the OS type field to see whether it says FAT or NTFS, but that still doesn't tell you the data was written by C#.
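
As a rough sketch of that idea, the OS type is the tenth header byte (offset 9). The class `GZipHeaderSketch` and the method `DescribeOs` are hypothetical names for this example, and only a few of the OS codes listed in RFC 1952 are mapped.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the .NET framework.
public static class GZipHeaderSketch
{
    // Returns a label for the OS byte, or null if the buffer has no gzip magic header.
    public static string DescribeOs(byte[] data)
    {
        if (data == null || data.Length < 10 || data[0] != 0x1f || data[1] != 0x8b)
        {
            return null;
        }

        // A subset of the OS codes defined in RFC 1952.
        switch (data[9])
        {
            case 0: return "FAT";
            case 3: return "Unix";
            case 11: return "NTFS";
            case 255: return "unknown";
            default: return "other (" + data[9] + ")";
        }
    }
}
```

Whatever the label says, it describes what the compressor chose to write, not which library wrote it.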

John Gardner
0

The answers posted above do the job most of the time, but be aware that the last byte in the header denotes the operating system ID. This means the code might work locally but fail when deployed to Azure. In my case, that last byte was not 0 as in the example, but 0xA (10).
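
One way around this, as a sketch rather than a definitive check, is to compare only the bytes that do not vary: the two magic bytes and the compression method. The class `GZipMagicSketch` and the method `LooksLikeGZip` are hypothetical names; the minimum length of 18 assumes a 10-byte header plus the 8-byte trailer.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the .NET framework.
public static class GZipMagicSketch
{
    // Checks ID1, ID2 and CM only, ignoring the XFL and OS bytes that vary by platform.
    public static bool LooksLikeGZip(byte[] data)
    {
        return data != null
            && data.Length >= 18      // 10-byte header + 8-byte trailer
            && data[0] == 0x1f        // ID1
            && data[1] == 0x8b        // ID2
            && data[2] == 8;          // CM = deflate
    }
}
```

Uncompressed data that happens to start with these three bytes would still pass, so a positive result only means "possibly gzipped".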

See also https://en.wikipedia.org/wiki/Gzip for a brief explanation of the header.