Using Gzip to compress/decompress an array of bytes

Question

I need to compress an array of bytes, so I wrote this snippet:

    using System;
    using System.IO;
    using System.IO.Compression;
    using System.Text;

    class Program
    {
        static void Main()
        {
            var test = "foo bar baz";

            var compressed = Compress(Encoding.UTF8.GetBytes(test));
            var decompressed = Decompress(compressed);
            Console.WriteLine("size of initial table = " + test.Length);
            Console.WriteLine("size of compressed table = " + compressed.Length);
            Console.WriteLine("size of  decompressed table = " + decompressed.Length);
            Console.WriteLine(Encoding.UTF8.GetString(decompressed));
            Console.ReadKey();
        }

        static byte[] Compress(byte[] data)
        {
            using (var compressedStream = new MemoryStream())
            using (var zipStream = new GZipStream(compressedStream, CompressionMode.Compress))
            {
                zipStream.Write(data, 0, data.Length);
                zipStream.Close();
                return compressedStream.ToArray();
            }
        }

        static byte[] Decompress(byte[] data)
        {
            using (var compressedStream = new MemoryStream(data))
            using (var zipStream = new GZipStream(compressedStream, CompressionMode.Decompress))
            using (var resultStream = new MemoryStream())
            {
                zipStream.CopyTo(resultStream);
                return resultStream.ToArray();
            }
        }
    }

The problem is the output: I don't understand why the size of the compressed array is greater than that of the decompressed one!

Any ideas?

Edit

After @spender's comment: if I change the test string, for example:

var test = "foo bar baz very long string for example hdgfgfhfghfghfghfghfghfghfghfghfghfghfhg";

I get a different result. So what is the minimum size the initial array must have for compression to pay off?

Because the data is so small that the overheads of the compression format outweigh the gain of compression. Try more data. Note: completely random data will not compress. — spender, Dec 01 '16 at 11:09
@spender plz see my edit and post your idea as an answer, thanks — Lamloumi Afif, Dec 01 '16 at 11:14

score 9 · Accepted Answer · answered Dec 01 '16 at 11:16

A compressed file has headers, and they increase the file size. When the input is very small, the output can even be bigger than the input, as you see. Try it with a bigger file.
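To see where the overhead stops dominating, here is a minimal sketch that compresses the same phrase repeated a growing number of times and prints both sizes. The class name `GzipSizeDemo` and the repeat counts are made up for illustration, and the exact byte counts depend on the runtime, so none are quoted here.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical demo class, not part of the original question.
static class GzipSizeDemo
{
    // Same approach as the question's Compress method.
    public static byte[] Compress(byte[] data)
    {
        using (var compressedStream = new MemoryStream())
        {
            // Disposing the GZipStream flushes the last block and the gzip trailer.
            using (var zipStream = new GZipStream(compressedStream, CompressionMode.Compress))
            {
                zipStream.Write(data, 0, data.Length);
            }
            // ToArray still works after the MemoryStream has been closed.
            return compressedStream.ToArray();
        }
    }

    public static void Main()
    {
        // Arbitrary repeat counts chosen for illustration.
        foreach (var repeats in new[] { 1, 10, 100, 1000 })
        {
            var sb = new StringBuilder();
            for (int i = 0; i < repeats; i++)
            {
                sb.Append("foo bar baz ");
            }

            byte[] input = Encoding.UTF8.GetBytes(sb.ToString());
            byte[] output = Compress(input);
            Console.WriteLine("input = " + input.Length + " bytes, compressed = " + output.Length + " bytes");
        }
    }
}
```

For a single repeat the compressed array should come out larger than the input, and for the larger repeat counts it should come out far smaller.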

Ashkan Mobayen Khiabani

score 3 · Answer 2 · edited May 23 '17 at 10:29

This is because the amount of data is so small that the overheads of the compression format outweigh the gain of compression.

Try more data.

If you compressed entirely random data (or already compressed data such as JPEG), you would never make any significant gain. However, the string new String('*', 1000000) would compress down really nicely.
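A rough way to check both claims is to compress one million pseudo-random bytes and one million '*' characters and compare the results. This is only a sketch: the class name `GzipEntropyDemo` and the seed 42 are arbitrary choices for illustration, and `System.Random` is used merely as a convenient source of incompressible-looking bytes.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical demo class, not part of the original answer.
static class GzipEntropyDemo
{
    public static byte[] Compress(byte[] data)
    {
        using (var compressedStream = new MemoryStream())
        {
            using (var zipStream = new GZipStream(compressedStream, CompressionMode.Compress))
            {
                zipStream.Write(data, 0, data.Length);
            }
            return compressedStream.ToArray();
        }
    }

    // One million pseudo-random bytes; the seed is an arbitrary assumption.
    public static byte[] RandomBytes(int count)
    {
        var buffer = new byte[count];
        new Random(42).NextBytes(buffer);
        return buffer;
    }

    // The highly repetitive input mentioned in the answer.
    public static byte[] StarBytes(int count)
    {
        return Encoding.UTF8.GetBytes(new string('*', count));
    }

    public static void Main()
    {
        byte[] random = RandomBytes(1000000);
        byte[] stars = StarBytes(1000000);

        Console.WriteLine("random: " + random.Length + " -> " + Compress(random).Length + " bytes");
        Console.WriteLine("stars:  " + stars.Length + " -> " + Compress(stars).Length + " bytes");
    }
}
```

The random buffer should stay at roughly its original size (or grow slightly), while the run of '*' should shrink to a tiny fraction of it.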

GZIP adds at least 18 bytes of overhead (a 10-byte header plus an 8-byte trailer), so input that is smaller than this, or only marginally larger, will not benefit even if it is easily compressible.
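Those fixed bytes can be inspected directly. Per the gzip format (RFC 1952), a stream starts with the magic bytes 0x1F 0x8B followed by the method id 0x08 for DEFLATE, and ends with a CRC-32 and the original length (ISIZE), both little-endian. The sketch below checks these fields on the output of `GZipStream`; the class and helper names are hypothetical, and it assumes the output is a single gzip member.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical demo class, not part of the original answer.
static class GzipFramingDemo
{
    public static byte[] Compress(byte[] data)
    {
        using (var compressedStream = new MemoryStream())
        {
            using (var zipStream = new GZipStream(compressedStream, CompressionMode.Compress))
            {
                zipStream.Write(data, 0, data.Length);
            }
            return compressedStream.ToArray();
        }
    }

    // True if the buffer is at least header + trailer long and starts with
    // the gzip magic bytes and the DEFLATE method id.
    public static bool HasGzipHeader(byte[] gz)
    {
        return gz.Length >= 18 && gz[0] == 0x1F && gz[1] == 0x8B && gz[2] == 0x08;
    }

    // Reads ISIZE: the last four bytes, little-endian, holding the
    // uncompressed length modulo 2^32.
    public static uint ReadOriginalSize(byte[] gz)
    {
        int n = gz.Length;
        return (uint)gz[n - 4]
             | ((uint)gz[n - 3] << 8)
             | ((uint)gz[n - 2] << 16)
             | ((uint)gz[n - 1] << 24);
    }

    public static void Main()
    {
        byte[] input = Encoding.UTF8.GetBytes("foo bar baz");
        byte[] gz = Compress(input);

        Console.WriteLine("has gzip header: " + HasGzipHeader(gz));
        Console.WriteLine("ISIZE in trailer: " + ReadOriginalSize(gz));
        Console.WriteLine("overhead: " + (gz.Length - input.Length) + " bytes");
    }
}
```

Whatever the DEFLATE payload shrinks to, the header and trailer are always there, which is why an 11-byte string cannot come out smaller than it went in.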

Here's an interesting question that probes further into GZIP: What's the most that GZIP or DEFLATE can increase a file size?
