15

What is wrong with the code below? I always get FALSE, meaning that after compression, the decompressed data does not match the original value.

public static bool Test()
        {
            string sample = "This is a compression test of microsoft .net gzip compression method and decompression methods";
            System.Text.ASCIIEncoding encoding = new System.Text.ASCIIEncoding();
            byte[] data = encoding.GetBytes(sample);
            bool result = false;

            //Compress
            MemoryStream cmpStream;
            cmpStream = new MemoryStream();
            GZipStream hgs = new GZipStream(cmpStream, CompressionMode.Compress);
            hgs.Write(data, 0, data.Length);
            byte[] cmpData = cmpStream.ToArray();

            MemoryStream decomStream;
            decomStream = new MemoryStream(cmpData);
            hgs = new GZipStream(decomStream, CompressionMode.Decompress);
            hgs.Read(data, 0, data.Length);

            string sampleOut = System.BitConverter.ToString(data);

            result = String.Equals(sample, sampleOut) ;
            return result;
        }

I would really appreciate it if you could point out where I am making a mistake.

Brian Tompsett - 汤莱恩
MehdiAnis

4 Answers

20

Close the GZipStream after the Write call.

Without calling Close, some data may still be buffered and not yet written to the underlying stream.
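As a minimal sketch of that advice (the class name GZipCloseSketch and its two methods are hypothetical helpers, not code from the question), the compressing GZipStream is disposed before the compressed bytes are copied out, and the decompressing side reads in a loop until Read returns 0:

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper class for illustration only.
public static class GZipCloseSketch
{
    public static byte[] Compress(byte[] data)
    {
        using (var cmpStream = new MemoryStream())
        {
            // Leaving this using block disposes (closes) the GZipStream,
            // which writes any buffered data into cmpStream.
            using (var gz = new GZipStream(cmpStream, CompressionMode.Compress))
            {
                gz.Write(data, 0, data.Length);
            }

            // MemoryStream.ToArray still works after the stream has been closed.
            return cmpStream.ToArray();
        }
    }

    public static byte[] Decompress(byte[] cmpData)
    {
        using (var input = new MemoryStream(cmpData))
        using (var gz = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            // Buffer size is an arbitrary choice for this sketch.
            var buffer = new byte[4096];
            int read;

            // Read may return fewer bytes than requested, so loop until it returns 0.
            while ((read = gz.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, read);
            }

            return output.ToArray();
        }
    }
}
```

With these helpers, Encoding.ASCII.GetString(GZipCloseSketch.Decompress(GZipCloseSketch.Compress(data))) should give back the original sample string.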

Mehrdad Afshari
  • 7
    @Blindy: I just checked with Reflector. Only `Close` is an option. `Flush` does nothing. – Mehrdad Afshari Oct 19 '09 at 20:27
  • Yes, I tried close after both compress and decompress, still it doesn't work. For some reason hgs.Read( ... ) doesn't read anything :: That's the problem. Problem remains regardless hgs.close() used or not. If you are so sure of close() or flush() can you please paste your WORKING code? Many thanks. =-- Mehdi Anis --= – MehdiAnis Oct 19 '09 at 21:01
  • 3
    Correct - DO NOT FLUSH - Close it. Debugged for an hour until I figured this out. – Brian Webster Feb 15 '10 at 08:46
  • Exactly what I figured after a long time. – Rahul Misra Feb 23 '15 at 10:13
15

Try this code:

public static bool Test()
        {
            string sample = "This is a compression test of microsoft .net gzip compression method and decompression methods";

            System.Text.ASCIIEncoding encoding = new System.Text.ASCIIEncoding();

            byte[] data = encoding.GetBytes(sample);
            bool result = false;

            // Compress
            MemoryStream cmpStream = new MemoryStream();

            GZipStream hgs = new GZipStream(cmpStream, CompressionMode.Compress);

            hgs.Write(data, 0, data.Length);

            byte[] cmpData = cmpStream.ToArray();

            MemoryStream decomStream = new MemoryStream(cmpData);

            hgs = new GZipStream(decomStream, CompressionMode.Decompress);
            hgs.Read(data, 0, data.Length);

            string sampleOut = encoding.GetString(data);

            result = String.Equals(sample, sampleOut);
            return result;
        }

The problem was that you were not using the ASCIIEncoding instance to get the string back for sampleOut.

EDIT: Here's a cleaned up version of the code to help with Closing/Disposing:

public static bool Test()
        {
            string sample = "This is a compression test of microsoft .net gzip compression method and decompression methods";

            System.Text.ASCIIEncoding encoding = new System.Text.ASCIIEncoding();

            byte[] data = encoding.GetBytes(sample);

            // Compress.
            GZipStream hgs;
            byte[] cmpData;

            using(MemoryStream cmpStream = new MemoryStream())
            using(hgs = new GZipStream(cmpStream, CompressionMode.Compress))
            {
                hgs.Write(data, 0, data.Length);
                hgs.Close();

                // Do this AFTER the stream is closed which sounds counter intuitive 
                // but if you do it before the stream will not be flushed
                // (even if you call flush which has a null implementation).
                cmpData = cmpStream.ToArray();
            }  

            using(MemoryStream decomStream = new MemoryStream(cmpData))
            using(hgs = new GZipStream(decomStream, CompressionMode.Decompress))
            {
                hgs.Read(data, 0, data.Length);
            }

            string sampleOut = encoding.GetString(data);

            bool result = String.Equals(sample, sampleOut);
            return result;
        }
rism
Jason Evans
  • YES! It works! But the actual problem still remains. It works as values in data not changed. If I reset data[] using " data = new byte[data.Length];" right before calling hgs.read(... .. .), result=false. As, the hgs.Read totally fails. It doesn't read anything. If you put "readCount=hgs.Read(...)" you will see the readCount=0, meaning nothing was read. That's the problem I am facing. hope you can shed some light. Thanks. many thanks to all for quick responses. – MehdiAnis Oct 19 '09 at 20:38
  • 2
    Sorry if I've misunderstood here, but are you saying that if you put 'data = new byte[data.Length];' before the 'hgs.Read()' call, then the result is false? This is what I would expect, since the data[] array is being wiped of its value at that point. Not sure I'm understanding things here, I must get some more coffee! :) – Jason Evans Oct 19 '09 at 20:45
  • 2
    I know this is an old question, but @MehdiAnis is correct. This code doesn't work. Observe the return value of hgs.Read(data, 0, data.Length) and you will see that it is zero. – Fantius Dec 21 '11 at 02:03
  • You MUST Close() the compression GZipStream BEFORE copying the compressed bytes to an array. – Fantius Dec 21 '11 at 02:25
  • @JasonEvans I altered the timing of the ToArray() call otherwise as per other comments it just doesn't work. Either that or a down vote and I see no reason to down vote an otherwise perfectly good answer. – rism Apr 17 '15 at 07:07
10

Three issues had to be fixed to solve the problem:

  1. After the write, the GZipStream needed to be closed :: hgs.Close();

  2. The GZipStream read needed to run in a WHILE loop, writing each smaller buffer of uncompressed data to a MemoryStream :: outStream.Write( ... );

  3. The decompressed byte[] array needed to be converted back using the encoding :: string sampleOut = encoding.GetString(data);

Here is the final code:

public static bool Test()
        {
            string sample = "This is a compression test of microsoft .net gzip compression method and decompression methods";
            System.Text.ASCIIEncoding encoding = new System.Text.ASCIIEncoding();
            byte[] data = encoding.GetBytes(sample);
            bool result = false;

            // Compress 
            MemoryStream cmpStream = new MemoryStream();
            GZipStream hgs = new GZipStream(cmpStream, CompressionMode.Compress, true);

            hgs.Write(data, 0, data.Length);
            hgs.Close();


            //DeCompress
            byte[] cmpData = cmpStream.ToArray();
            MemoryStream decomStream = new MemoryStream(cmpData);

            data = new byte[data.Length];
            hgs = new GZipStream(decomStream, CompressionMode.Decompress, true);

            byte[] step = new byte[16]; // Buffer size is arbitrary; any positive size works
            MemoryStream outStream = new MemoryStream();
            int readCount;

            do
            {
                readCount = hgs.Read(step, 0, step.Length);
                outStream.Write(step, 0, readCount);
            } while (readCount > 0);
            hgs.Close();

            string sampleOut = encoding.GetString(outStream.ToArray());
            result = String.Equals(sample, sampleOut);
            return result; 
        }

I really had trouble getting compress/decompress to work with the Microsoft .NET GZipStream object. Finally, I think I got it the right way. Many thanks to all, as the solution came from all of you.

MehdiAnis
4

Here's my cleaned up version of the final solution:


  [Test]
  public void Test_zipping_with_memorystream()
  {
   const string sample = "This is a compression test of microsoft .net gzip compression method and decompression methods";
   var encoding = new ASCIIEncoding();
   var data = encoding.GetBytes(sample);
   string sampleOut;
   byte[] cmpData;

   // Compress 
   using (var cmpStream = new MemoryStream())
   {
    using (var hgs = new GZipStream(cmpStream, CompressionMode.Compress))
    {
     hgs.Write(data, 0, data.Length);
    }
    cmpData = cmpStream.ToArray();
   }

   using (var decomStream = new MemoryStream(cmpData))
   {
    using (var hgs = new GZipStream(decomStream, CompressionMode.Decompress))
    {
     using (var reader = new StreamReader(hgs))
     {
      sampleOut = reader.ReadToEnd();
     }
    }
   }

   Assert.IsNotNullOrEmpty(sampleOut);
   Assert.AreEqual(sample, sampleOut);
  }
oPJo