
I am compressing a byte array, and when I decompress it again I get an OutOfMemoryException. I don't understand why I get this error when I have enough memory to hold the data.

The compressed data that has to be decompressed is around 20MB, but I always get an OutOfMemoryException.

Below is the code for the same.

public byte[] Compress(byte[] data)
{
    byte[] compressArray = null;
    try
    {
        using (MemoryStream memoryStream = new MemoryStream())
        {
            using (DeflateStream deflateStream = new DeflateStream(memoryStream, CompressionMode.Compress))
            {
                deflateStream.Write(data, 0, data.Length);
                deflateStream.Close();
            }
            compressArray = memoryStream.GetBuffer();
            memoryStream.Dispose();
        }
    }
    catch (Exception exception)
    {
        LogManager.LogEvent(EventLogEntryType.Error, exception.Message);
        return data;
    }
    finally { GC.Collect(); }
    return compressArray;
}

public static byte[] Decompress_Bytes(byte[] data)// Around 20MB data
{
    byte[] decompressedArray = null;
    try
    {
        using (MemoryStream decompressedStream = new MemoryStream())
        {
            using (MemoryStream compressStream = new MemoryStream(data))
            {
                using (DeflateStream deflateStream = new DeflateStream(compressStream, CompressionMode.Decompress))
                {
                    deflateStream.CopyTo(decompressedStream);// Exception thrown at this line.
                    deflateStream.Close();
                }
                compressStream.Dispose();
            }
            decompressedArray = decompressedStream.GetBuffer();
            decompressedStream.Dispose();
        }
    }
    catch (Exception exception)
    {
        return data;
    }
    finally { GC.Collect(); }

    return decompressedArray;
}

Below is the stack trace for better understanding.

at System.IO.MemoryStream.set_Capacity(Int32 value)
at System.IO.MemoryStream.EnsureCapacity(Int32 value)
at System.IO.MemoryStream.Write(Byte[] buffer, Int32 offset, Int32 count)
at System.IO.Stream.InternalCopyTo(Stream destination, Int32 bufferSize)
at System.IO.Stream.CopyTo(Stream destination)
at Symtrax.SQConsole.ConsoleConnectClass.Decompress_Bytes(Byte[] data) in c:\Developement\BI\branch_5.0\MapDesignerUNICODE\ConsoleConnector\SQConsole\ConsoleConnectClass.cs:line 3710

I found many relevant questions about this, but none of them seem to solve my issue.

Since I don't have enough reputation points to comment, I had to post a new question. Thanks in advance.

poke
JDoshi
  • How much memory does your entire process use? – Allan S. Hansen Jul 12 '16 at 07:00
  • @Allan At the time of compressing the bytes `data` is nearly 80MB then after compression the data returned is nearly 17MB. Which is then decompressed. I am disposing the object of memory stream so I guess what happens at the time of `compress()` won't matter. – JDoshi Jul 12 '16 at 07:05
  • Are you doing this decompression only once or multiple times in a row? I.e. should we be focusing on memory leaks as well? – Paul-Jan Jul 12 '16 at 07:06
  • @Paul I am decompressing it only once. – JDoshi Jul 12 '16 at 07:07
  • One thing to notice is that you don't need to call `Dispose` on objects declared in a `using`. Also, I'd be interested in knowing the memory usage of the actual process, not the amount of data. Take a look in your resource monitor when you run the code and see the size of your working memory usage. Also, I don't think you need all the extra memory streams, as streams are usually very compatible with each other, so the reason might simply be that you "duplicate" your data too many times in memory – Allan S. Hansen Jul 12 '16 at 07:10
  • You should be using ToArray() instead of GetBuffer(). See http://stackoverflow.com/questions/13053739/when-is-getbuffer-on-memorystream-ever-useful – Ondrej Svejdar Jul 12 '16 at 07:12
  • @OndrejSvejdar The link mentioned something like this "This could be usefull when you're in the situation that you will receive a stream without knowing its size. If the stream received is usually very big, it will be much faster to call GetBuffer() than calling ToArray()". So I don't think `GetBuffer()` could be the issue. – JDoshi Jul 12 '16 at 07:21
  • The difference between `GetBuffer` and `ToArray`: `ToArray` will copy the data, `GetBuffer` just returns the internal buffer. However, the internal buffer might be (much) larger than the actual data, so you need to truncate it yourself. – Dirk Vollmar Jul 12 '16 at 07:29
  • @JDoshi - GetBuffer will return more bytes than were actually written and you're not truncating it. Thus you're attempting to unzip something that is not a valid zip, and that's why you're in trouble (IMHO). – Ondrej Svejdar Jul 12 '16 at 07:30
  • @AllanS.Hansen Working memory usage for my application goes up to 500MB and total memory used goes up to 6.2GB. I have 8GB of memory, so 1.8GB still remains free. It is a 32-bit application, but even there the object size limit is 2GB, which I don't think is exceeded. As far as duplicate streams are concerned: while decompressing, `decompressedStream` is the stream that `deflatestream` writes to. I am converting `data` to a stream and passing it to `deflatestream` via `compressStream`. I can't see how to streamline it in a better way. Please suggest a way to achieve the same. – JDoshi Jul 12 '16 at 07:33
  • Have you tried setting `deflateStream`s `Position` back to 0 after copying data to it? – xofz Jul 14 '16 at 20:08

1 Answer

As already stated in the comments, you're getting the internal buffer with GetBuffer, which has different length characteristics than the array you get from calling ToArray.

I have added some dump statements in your code so LINQPad can reveal what is happening:

public byte[] Compress(byte[] data)
{
    byte[] compressArray = null;
    data.Length.Dump("initial array length");
    try
    {
        using (MemoryStream memoryStream = new MemoryStream())
        {
            using (DeflateStream deflateStream = new DeflateStream(memoryStream, CompressionMode.Compress))
            {
                deflateStream.Write(data, 0, data.Length);
                deflateStream.Close();
            }
            memoryStream.GetBuffer().Length.Dump("buffer compress len");
            compressArray = memoryStream.ToArray();
            compressArray.Length.Dump("compress array len");
            // no need to call Dispose, using does that for you
            //memoryStream.Dispose();
        }
    }
    catch (Exception exception)
    {
        exception.Dump();
        return data;
    }
    finally { GC.Collect(); }
    return compressArray;
}

public static byte[] Decompress_Bytes(byte[] data)// Around 20MB data
{
    byte[] decompressedArray = null;
    try
    {
        using (MemoryStream decompressedStream = new MemoryStream())
        {
            using (MemoryStream compressStream = new MemoryStream(data))
            {
                using (DeflateStream deflateStream = new DeflateStream(compressStream, CompressionMode.Decompress))
                {
                    deflateStream.CopyTo(decompressedStream);// Exception thrown at this line.
                    deflateStream.Close();
                }
                // no need, using does that
                //compressStream.Dispose();
            }
            decompressedStream.GetBuffer().Length.Dump("buffer decompress len");
            decompressedArray = decompressedStream.ToArray();
            decompressedArray.Length.Dump("decompress array len");
            // no need, using does that
            //decompressedStream.Dispose();
        }
    }
    catch (Exception exception)
    {
        exception.Dump();
        return data;
    }
    finally { GC.Collect(); }

    return decompressedArray;
}

This is the output:

initial array length 248404

buffer compress len 262144

compress array len 189849

buffer decompress len 327680

decompress array len 248404

As you can see from these numbers, the lengths differ considerably: the internal buffer returned by GetBuffer (262144 bytes) is larger than the compressed data that was actually written (189849 bytes). You could possibly get away with those extra bytes if the Deflate format allowed byte streams that have extra bytes.

Using GetBuffer instead of ToArray might seem beneficial, but I expect the memory allocation and CPU ticks needed for copying the final array to be negligible, especially if the memory stream is disposed anyway. That fact actually reduces the memory footprint a bit.
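
For reference, here is a minimal sketch of the ToArray variant without the LINQPad dump statements, the explicit Dispose calls and the GC.Collect calls. The class name `DeflateHelper` and the method name `Decompress` are made up for this illustration and are not part of the original code. Unlike the original, it also lets exceptions propagate instead of returning the input array.

```csharp
using System.IO;
using System.IO.Compression;

// Hypothetical helper class, named for illustration only.
public static class DeflateHelper
{
    // Compresses data and returns exactly the bytes that were written.
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            // Disposing the DeflateStream flushes the final block into output.
            using (var deflate = new DeflateStream(output, CompressionMode.Compress))
            {
                deflate.Write(data, 0, data.Length);
            }
            // ToArray still works on a closed MemoryStream and copies only
            // Length bytes, not the whole internal buffer.
            return output.ToArray();
        }
    }

    // Decompresses a byte array produced by Compress.
    public static byte[] Decompress(byte[] data)
    {
        using (var input = new MemoryStream(data))
        using (var deflate = new DeflateStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            deflate.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

Calling `DeflateHelper.Decompress(DeflateHelper.Compress(bytes))` should then give back an array with the same length and content as `bytes`.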

If you still insist on re-using the memory stream buffer make sure to also return and provide the actual length in the buffer:

public byte[] Compress(byte[] data, out int len)
{
    byte[] compressArray = null;
    data.Length.Dump("initial array length");
    try
    {
        using (MemoryStream memoryStream = new MemoryStream())
        {
            // keep the stream open, we need the length!
            using (DeflateStream deflateStream = new DeflateStream(
                                       memoryStream,
                                       CompressionMode.Compress, 
                                       true))
            {
                deflateStream.Write(data, 0, data.Length);
                deflateStream.Close();
            }
            // output length
            len = (int) memoryStream.Length;                
            compressArray = memoryStream.GetBuffer();
        }
    }
    catch (Exception exception)
    {
        exception.Dump();
        len =-1;
        return data;
    }
    finally { GC.Collect(); }
    return compressArray;
}

public static byte[] Decompress_Bytes(byte[] data, ref int len)// Around 20MB data
{
    byte[] decompressedArray = null;
    try
    {
        using (MemoryStream decompressedStream = new MemoryStream())
        {
            // use the overload that let us limit the memorystream buffer
            using (MemoryStream compressStream = new MemoryStream(data,0, len))
            {
                // keep the stream open
                using (DeflateStream deflateStream = new DeflateStream(
                                compressStream, 
                                CompressionMode.Decompress, 
                                true))
                {
                    deflateStream.CopyTo(decompressedStream);
                    deflateStream.Close();
                }
            }
            // output length
            decompressedArray = decompressedStream.GetBuffer();
            len = (int) decompressedStream.Length;
        }
    }
    catch (Exception exception)
    {
        exception.Dump();
        return data;
    }
    finally { GC.Collect(); }

    return decompressedArray;
}

If you use the above code, you'll have to call it like this:

int len;
var cmp = Compress(Encoding.UTF8.GetBytes(sb.ToString()), out len);
var dec = Decompress_Bytes(cmp,ref len);

Note that to use the bytes in dec you must only take the first len bytes into account. In practice this is done with Array.Copy, which defeats the purpose of this solution and brings us back to the one that calls ToArray...
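
To make that last point concrete, this is roughly what the truncation step looks like. The class `BufferHelper` and the method `Truncate` are hypothetical names used only for this sketch. It allocates a second array of len bytes and copies into it, which is essentially the work ToArray would have done in the first place.

```csharp
using System;

// Hypothetical helper, named for illustration only.
public static class BufferHelper
{
    // Returns a new array that holds only the first len bytes of buffer.
    public static byte[] Truncate(byte[] buffer, int len)
    {
        if (buffer == null) throw new ArgumentNullException(nameof(buffer));
        if (len < 0 || len > buffer.Length) throw new ArgumentOutOfRangeException(nameof(len));

        var exact = new byte[len];
        Array.Copy(buffer, 0, exact, 0, len);
        return exact;
    }
}
```

With the code above you would call `BufferHelper.Truncate(dec, len)`. Keep in mind that the Compress method shown earlier sets len to -1 on failure, which this sketch rejects with an exception.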

rene