52

I am having an issue with this test function where I take an in memory string, compress it, and decompress it. The compression works great, but I can't seem to get the decompression to work.

//Compress
System.IO.MemoryStream outStream = new System.IO.MemoryStream();                
GZipStream tinyStream = new GZipStream(outStream, CompressionMode.Compress);
mStream.Position = 0;
mStream.CopyTo(tinyStream);

//Decompress    
outStream.Position = 0;
GZipStream bigStream = new GZipStream(outStream, CompressionMode.Decompress);
System.IO.MemoryStream bigStreamOut = new System.IO.MemoryStream();
bigStream.CopyTo(bigStreamOut);

//Results:
//bigStreamOut.Length == 0
//outStream.Position == the end of the stream.

I believe that bigStream out should at least have data in it, especially if my source stream (outStream) is being read. is this a MSFT bug or mine?

makerofthings7
  • 60,103
  • 53
  • 215
  • 448

10 Answers10

111

What happens in your code is that you keep opening streams, but you never close them.

  • In line 2, you create a GZipStream. This stream will not write anything to the underlying stream until it feels it’s the right time. You can tell it to by closing it.

  • However, if you close it, it will close the underlying stream (outStream) too. Therefore you can’t use mStream.Position = 0 on it.

You should always use using to ensure that all your streams get closed. Here is a variation on your code that works.

var inputString = "“ ... ”";
byte[] compressed;
string output;

using (var outStream = new MemoryStream())
{
    using (var tinyStream = new GZipStream(outStream, CompressionMode.Compress))
    using (var mStream = new MemoryStream(Encoding.UTF8.GetBytes(inputString)))
        mStream.CopyTo(tinyStream);

    compressed = outStream.ToArray();
}

// “compressed” now contains the compressed string.
// Also, all the streams are closed and the above is a self-contained operation.

using (var inStream = new MemoryStream(compressed))
using (var bigStream = new GZipStream(inStream, CompressionMode.Decompress))
using (var bigStreamOut = new MemoryStream())
{
    bigStream.CopyTo(bigStreamOut);
    output = Encoding.UTF8.GetString(bigStreamOut.ToArray());
}

// “output” now contains the uncompressed string.
Console.WriteLine(output);
Timwi
  • 65,159
  • 33
  • 165
  • 230
  • 8
    +1 good answer Timwi. Just to add to this, GZip has some internal buffering of data it needs to do in order to compress. It can't know that its done receiving data until you close it and therefore it doesn't spit out the last few bytes and decompression of the partial stream fails. – MerickOWA Sep 15 '10 at 22:29
  • I think we're on .NET 3.5 (working with Unity), so .CopyTo doesn't exist yet. Looking elsewhere on SO for how to copy from one stream to another: http://stackoverflow.com/questions/230128/best-way-to-copy-between-two-stream-instances – Almo May 22 '14 at 14:24
  • Thanks for this, i've been having trouble figuring out exactly how to arrange the streams to get the correct output in both direction – MikeT Apr 15 '16 at 13:29
36

This is a known issue: http://blogs.msdn.com/b/bclteam/archive/2006/05/10/592551.aspx

I have changed your code a bit so this one works:

var mStream = new MemoryStream(new byte[100]);
var outStream = new System.IO.MemoryStream();

using (var tinyStream = new GZipStream(outStream, CompressionMode.Compress))
{
    mStream.CopyTo(tinyStream);           
}

byte[] bb = outStream.ToArray();

//Decompress                
var bigStream = new GZipStream(new MemoryStream(bb), CompressionMode.Decompress);
var bigStreamOut = new System.IO.MemoryStream();
bigStream.CopyTo(bigStreamOut);
Indy9000
  • 8,651
  • 2
  • 32
  • 37
Aliostad
  • 80,612
  • 21
  • 160
  • 208
  • 2
    Shouldn't `new GZipStream(outStream, CompressionMode.Compress, true)` be used to leave the stream opened so that the using statement can close it as suggested in @briantyler's answer? – bounav Jul 25 '19 at 11:39
9

The way to compress and decompress to and from a MemoryStream is:

public static Stream Compress(
    Stream decompressed, 
    CompressionLevel compressionLevel = CompressionLevel.Fastest)
{
    var compressed = new MemoryStream();
    using (var zip = new GZipStream(compressed, compressionLevel, true))
    {
        decompressed.CopyTo(zip);
    }

    compressed.Seek(0, SeekOrigin.Begin);
    return compressed;
}

public static Stream Decompress(Stream compressed)
{
    var decompressed = new MemoryStream();
    using (var zip = new GZipStream(compressed, CompressionMode.Decompress, true))
    {
        zip.CopyTo(decompressed);
    }

    decompressed.Seek(0, SeekOrigin.Begin);
    return decompressed;
}

This leaves the compressed / decompressed stream open and as such usable after creating it.

satnhak
  • 9,407
  • 5
  • 63
  • 81
5

Another implementation, in VB.NET:

Imports System.Runtime.CompilerServices
Imports System.IO
Imports System.IO.Compression

Public Module Compressor

    <Extension()> _
    Function CompressASCII(str As String) As Byte()

        Dim bytes As Byte() = Encoding.ASCII.GetBytes(str)

        Using ms As New MemoryStream

            Using gzStream As New GZipStream(ms, CompressionMode.Compress)

                gzStream.Write(bytes, 0, bytes.Length)

            End Using

            Return ms.ToArray

        End Using

    End Function

    <Extension()> _
    Function DecompressASCII(compressedString As Byte()) As String

        Using ms As New MemoryStream(compressedString)

            Using gzStream As New GZipStream(ms, CompressionMode.Decompress)

                Using sr As New StreamReader(gzStream, Encoding.ASCII)

                    Return sr.ReadToEnd

                End Using

            End Using

        End Using

    End Function

    Sub TestCompression()

        Dim input As String = "fh3o047gh"

        Dim compressed As Byte() = input.CompressASCII()

        Dim decompressed As String = compressed.DecompressASCII()

        If input <> decompressed Then
            Throw New ApplicationException("failure!")
        End If

    End Sub

End Module
Nikolai Koudelia
  • 2,494
  • 1
  • 26
  • 28
  • 1
    Even if your post is 6 years old and I am not the questioneer - it helped me today. Thank you! – muffi Jan 31 '19 at 12:05
2
    public static byte[] compress(byte[] data)
    {
        using (MemoryStream outStream = new MemoryStream())
        {
            using (GZipStream gzipStream = new GZipStream(outStream, CompressionMode.Compress))
            using (MemoryStream srcStream = new MemoryStream(data))
                srcStream.CopyTo(gzipStream);
            return outStream.ToArray();
        }
    }

    public static byte[] decompress(byte[] compressed)
    {
        using (MemoryStream inStream = new MemoryStream(compressed))
        using (GZipStream gzipStream = new GZipStream(inStream, CompressionMode.Decompress))
        using (MemoryStream outStream = new MemoryStream())
        {
            gzipStream.CopyTo(outStream);
            return outStream.ToArray();
        }
    }
PhonPanom
  • 385
  • 5
  • 8
2

If you are attempting to use the MemoryStream (e.g. passing it into another function) but receiving the Exception "Cannot access a closed Stream." then there is another GZipStream constructor you can use that will help you.

By passing in a true to the leaveOpen parameter, you can instruct GZipStream to leave the stream open after disposing of itself, by default it closes the target stream (which I didn't expect). https://msdn.microsoft.com/en-us/library/27ck2z1y(v=vs.110).aspx

using (FileStream fs = File.OpenRead(f))
using (var compressed = new MemoryStream())
{
    //Instruct GZipStream to leave the stream open after performing the compression.
    using (var gzipstream = new GZipStream(compressed, CompressionLevel.Optimal, true))
        fs.CopyTo(gzipstream);

    //Do something with the memorystream
    compressed.Seek(0, SeekOrigin.Begin);
    MyFunction(compressed);
}
Snives
  • 1,226
  • 11
  • 21
2

I had an issue where *.CopyTo(stream)* would end up with a byte[0] result. The solution was to add .Position=0 before calling .CopyTo(stream) Answered here

I also use a BinaryFormatter that would throw an 'End of stream encountered before parsing was completed' exception if position was not set to 0 before deserialization. Answered here

This is the code that worked for me.

 public static byte[] SerializeAndCompressStateInformation(this IPluginWithStateInfo plugin, Dictionary<string, object> stateInfo)
    {
        byte[] retArr = new byte[] { byte.MinValue };
        try
        {
            using (MemoryStream msCompressed = new MemoryStream())//what gzip writes to
            {
                using (GZipStream gZipStream = new GZipStream(msCompressed, CompressionMode.Compress))//setting up gzip
                using (MemoryStream msToCompress = new MemoryStream())//what the settings will serialize to
                {
                    BinaryFormatter formatter = new BinaryFormatter();
                    //serialize the info into bytes
                    formatter.Serialize(msToCompress, stateInfo);
                    //reset to 0 to read from beginning byte[0] fix.
                    msToCompress.Position = 0;
                    //this then does the compression
                    msToCompress.CopyTo(gZipStream);
                }
                //the compressed data as an array of bytes
                retArr = msCompressed.ToArray();
            }
        }
        catch (Exception ex)
        {
            Logger.Error(ex.Message, ex);
            throw ex;
        }
        return retArr;
    }


    public static Dictionary<string, object> DeserializeAndDecompressStateInformation(this IPluginWithStateInfo plugin, byte[] stateInfo)
    {
        Dictionary<string, object> settings = new Dictionary<string, object>();
        try
        {

            using (MemoryStream msDecompressed = new MemoryStream()) //the stream that will hold the decompressed data
            {
                using (MemoryStream msCompressed = new MemoryStream(stateInfo))//the compressed data
                using (GZipStream gzDecomp = new GZipStream(msCompressed, CompressionMode.Decompress))//the gzip that will decompress
                {
                    msCompressed.Position = 0;//fix for byte[0]
                    gzDecomp.CopyTo(msDecompressed);//decompress the data
                }
                BinaryFormatter formatter = new BinaryFormatter();
                //prevents 'End of stream encountered' error
                msDecompressed.Position = 0;
                //change the decompressed data to the object
                settings = formatter.Deserialize(msDecompressed) as Dictionary<string, object>;
            }
        }
        catch (Exception ex)
        {
            Logger.Error(ex.Message, ex);
            throw ex;
        }
        return settings;
    }
CobyC
  • 2,058
  • 1
  • 18
  • 24
1

I thought I would share this answer for anyone interested in reproducing this on PowerShell, the code is mostly inspired from Timwi's helpful answer, however unfortunately as of now there is no implementation for the using statement like on C# for PowerShell, hence the need to manually dispose the streams before output.

Functions below requires PowerShell 5.0+.

Improved versions of these 2 functions as well as Compression From File Path and Expansion from File Path can be found in this repo as well as in the PowerShell Gallery.

using namespace System.Text
using namespace System.IO
using namespace System.IO.Compression
using namespace System.Collections
using namespace System.Management.Automation
using namespace System.Collections.Generic
using namespace System.Management.Automation.Language

Add-Type -AssemblyName System.IO.Compression

class EncodingCompleter : IArgumentCompleter {
    [IEnumerable[CompletionResult]] CompleteArgument (
        [string] $commandName,
        [string] $parameterName,
        [string] $wordToComplete,
        [CommandAst] $commandAst,
        [IDictionary] $fakeBoundParameters
    ) {
        [CompletionResult[]] $arguments = foreach($enc in [Encoding]::GetEncodings().Name) {
            if($enc.StartsWith($wordToComplete)) {
                [CompletionResult]::new($enc)
            }
        }
        return $arguments
    }
}
  • Compression from string to Base64 GZip compressed string:
function Compress-GzipString {
    [cmdletbinding()]
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [string] $String,

        [Parameter()]
        [ArgumentCompleter([EncodingCompleter])]
        [string] $Encoding = 'utf-8',

        [Parameter()]
        [CompressionLevel] $CompressionLevel = 'Optimal'
    )

    try {
        $enc       = [Encoding]::GetEncoding($Encoding)
        $outStream = [MemoryStream]::new()
        $gzip      = [GZipStream]::new($outStream, [CompressionMode]::Compress, $CompressionLevel)
        $inStream  = [MemoryStream]::new($enc.GetBytes($string))
        $inStream.CopyTo($gzip)
    }
    catch {
        $PSCmdlet.WriteError($_)
    }
    finally {
        $gzip, $outStream, $inStream | ForEach-Object Dispose
    }

    try {
        [Convert]::ToBase64String($outStream.ToArray())
    }
    catch {
        $PSCmdlet.WriteError($_)
    }
}
  • Expansion from Base64 GZip compressed string to string:
function Expand-GzipString {
    [cmdletbinding()]
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [string] $String,

        [Parameter()]
        [ArgumentCompleter([EncodingCompleter])]
        [string] $Encoding = 'utf-8'
    )

    try {
        $enc       = [Encoding]::GetEncoding($Encoding)
        $bytes     = [Convert]::FromBase64String($String)
        $outStream = [MemoryStream]::new()
        $inStream  = [MemoryStream]::new($bytes)
        $gzip      = [GZipStream]::new($inStream, [CompressionMode]::Decompress)
        $gzip.CopyTo($outStream)
        $enc.GetString($outStream.ToArray())
    }
    catch {
        $PSCmdlet.WriteError($_)
    }
    finally {
        $gzip, $outStream, $inStream | ForEach-Object Dispose
    }
}

And for the little Length comparison, querying the Loripsum API:

$loremIp = Invoke-RestMethod loripsum.net/api/10/long
$compressedLoremIp = Compress-GzipString $loremIp

$loremIp, $compressedLoremIp | Select-Object Length

Length
------
  8353
  4940

(Expand-GzipString $compressedLoremIp) -eq $loremIp # => Should be True
Santiago Squarzon
  • 41,465
  • 5
  • 14
  • 37
0

If you still need it, you can use GZipStream constructor wit boolean argument (there are two such constructors) and pass true value there:

tinyStream = new GZipStream(outStream, CompressionMode.Compress, true);

In that case, when you close your tynyStream, your out stream will be still opened. Don't forget to copy data:

mStream.CopyTo(tinyStream);
tinyStream.Close();

Now you've got memory stream outStream with zipped data

Bugs and kisses for U

Good luck

0

Please refer to below link, It is avoid to use double MemoryStream. https://stackoverflow.com/a/53644256/1979406

Haryono
  • 2,184
  • 1
  • 21
  • 14