I have a reporting tool that sends query requests to a server. After the query is done by the server the result is sent back to the requesting reporting tool. The communication is done using WCF.

The queried data, stored in a DataSet object, is very large, usually around 100 MB.

To speed up the transmission, I serialize the DataSet (with BinaryFormatter) and compress it. The object transmitted between the server and the reporting tool is a byte array.

However, after a few requests the reporting tool throws an OutOfMemoryException when it tries to deserialize the DataSet. The exception is thrown when I call:

dataSet = (DataSet) formatter.Deserialize(dstream);

dstream is the DeflateStream used to decompress the transmitted compressed byte array.

The exception occurs in a sub call of formatter.Deserialize when the byte array is created out of the stream.

Is there any other way of binary serialization that has a better mechanism to prevent this exception?

Implementation:

The method to serialize and compress the DataSet (used by the server)

public static byte[] Compress(DataSet dataSet)
{
    using (var input = new MemoryStream())
    {
        var binaryFormatter = new BinaryFormatter();
        binaryFormatter.Serialize(input, dataSet);

        using (var output = new MemoryStream())
        {
            using (var compressor = new DeflateStream(output, CompressionLevel.Optimal))
            {
                input.Position = 0;

                var buffer = new byte[1024];

                int read;

                while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
                    compressor.Write(buffer, 0, read);

                compressor.Close();

                return output.ToArray();
            }
        }
    }
}

The method to decompress and deserialize the DataSet (used by the reporting tool)

public static DataSet Decompress(byte[] data)
{
    DataSet dataSet;

    using (var input = new MemoryStream(data))
    {
        using (var dstream = new DeflateStream(input, CompressionMode.Decompress))
        {
            var formatter = new BinaryFormatter();
            dataSet = (DataSet) formatter.Deserialize(dstream);
        }
    }

    return dataSet;
}

Stacktrace:

at System.Array.InternalCreate(Void* elementType, Int32 rank, Int32* pLengths, Int32* pLowerBounds)
at System.Array.CreateInstance(Type elementType, Int32 length)
at System.Array.UnsafeCreateInstance(Type elementType, Int32 length)
at System.Runtime.Serialization.Formatters.Binary.ObjectReader.ParseArray(ParseRecord pr)
at System.Runtime.Serialization.Formatters.Binary.ObjectReader.ParseObject(ParseRecord pr)
at System.Runtime.Serialization.Formatters.Binary.ObjectReader.Parse(ParseRecord pr)
at System.Runtime.Serialization.Formatters.Binary.__BinaryParser.ReadArray(BinaryHeaderEnum binaryHeaderEnum)
at System.Runtime.Serialization.Formatters.Binary.__BinaryParser.Run()
at System.Runtime.Serialization.Formatters.Binary.ObjectReader.Deserialize(HeaderHandler handler, __BinaryParser serParser, Boolean fCheck, Boolean isCrossAppDomain, IMethodCallMessage methodCallMessage)
at System.Runtime.Serialization.Formatters.Binary.BinaryFormatter.Deserialize(Stream serializationStream, HeaderHandler handler, Boolean fCheck, Boolean isCrossAppDomain, IMethodCallMessage methodCallMessage)
at System.Runtime.Serialization.Formatters.Binary.BinaryFormatter.Deserialize(Stream serializationStream)
at DRX.PTClientMonitoring.Infrastructure.Helper.DataSetCompressor.Decompress(Byte[] data) in c:\_develop\PTClientMonitoringTool\PTClientMonitoringTool\Source\DRX.PTClientMonitoring.Infrastructure\Helper\DataSetCompressor.cs:line 51
at DRX.PTClientMonitoring.Reporting.ViewModels.ShellViewModel.<>c__DisplayClassf.<ExecudeDefinedQuery>b__4() in c:\_develop\PTClientMonitoringTool\PTClientMonitoringTool\Source\DRX.PTClientMonitoring.Reporting\ViewModels\ShellViewModel.cs:line 347
Anton Sterr
  • It would be helpful if you could share the relevant code. – FaizanHussainRabbani Feb 26 '18 at 09:42
  • On the sending side, do you set `dataSet.RemotingFormat = SerializationFormat.Binary;` as recommended [here](https://stackoverflow.com/a/4685413/3744182)? Also, can you share the traceback at which the out-of-memory exception occurs? I'm under the impression that `BinaryFormatter` normally does read streams incrementally. – dbc Feb 26 '18 at 09:51
  • Take a look at the methods used by the server and reporting tool. The garbage collection call before the actual deserialization could not prevent the exception. I am using a DataSet because it contains multiple tables depending on the query request of the reporting tool. – Anton Sterr Feb 26 '18 at 09:57
  • The stacktrace is added. I have set the RemotingFormat but the exception still occurs after a certain amount of requests. – Anton Sterr Feb 26 '18 at 10:22
  • Then if I were to hazard a guess, it's that you are creating too many objects on the [large object heap](https://stackoverflow.com/q/8951836), eventually causing memory fragmentation and out-of-memory errors. The large `byte[] data` array is an obvious candidate for this. Is there any way you can stream the `DataSet` in directly rather than loading it into an intermediate byte array? If not, take Marc Gravell's suggestion and copy it into some sort of temp `Stream`. – dbc Feb 26 '18 at 12:18

1 Answer

Before serializing, set:

yourDataSet.RemotingFormat = SerializationFormat.Binary;

That should help a lot. The default, even when using BinaryFormatter, is XML.
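As a rough sketch of where that setting would go in the question's Compress method, assuming .NET Framework (where BinaryFormatter is still available): the class name DataSetCompressorSketch is made up for illustration, and serializing straight into the DeflateStream instead of going through a second MemoryStream is my own variation, not something the question's code does.

```csharp
using System.Data;
using System.IO;
using System.IO.Compression;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical helper name; mirrors the Compress method from the question.
public static class DataSetCompressorSketch
{
    public static byte[] Compress(DataSet dataSet)
    {
        // Without this, BinaryFormatter stores the DataSet contents as XML text.
        dataSet.RemotingFormat = SerializationFormat.Binary;

        using (var output = new MemoryStream())
        {
            // Serialize directly into the compressor, so the uncompressed
            // payload is never held in a separate buffer.
            using (var compressor = new DeflateStream(output, CompressionLevel.Optimal))
            {
                new BinaryFormatter().Serialize(compressor, dataSet);
            }

            // Disposing the DeflateStream flushes it and closes output;
            // MemoryStream.ToArray still works on a closed stream.
            return output.ToArray();
        }
    }
}
```

The Decompress side should not need any change for this, since the format is recorded in the serialized data.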

Note, however, that DataSet and DataTable are inherently not great candidates for optimization. There are a lot of great serialization tools that will do a much better job of packing your data, but they invariably require strong type models, i.e. List<SomeSpecificType> where SomeSpecificType is a POCO/DTO class. Even WCF only barely tolerates DataTable/DataSet. So if you can get rid of your dependency on DataTable/DataSet: I strongly advise doing so.
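To illustrate what such a type model might look like, here is a minimal sketch. ReportRow, ReportRowMapper and the column names Id, Name and Timestamp are all invented for the example; the real columns depend on your queries, and the casts assume Id and Timestamp are never NULL.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Runtime.Serialization;

// Hypothetical DTO; the members are assumptions, not taken from the question.
[DataContract]
public class ReportRow
{
    [DataMember(Order = 1)] public int Id { get; set; }
    [DataMember(Order = 2)] public string Name { get; set; }
    [DataMember(Order = 3)] public DateTime Timestamp { get; set; }
}

public static class ReportRowMapper
{
    // Copies one DataTable into a strongly typed list on the server side.
    public static List<ReportRow> FromTable(DataTable table)
    {
        var rows = new List<ReportRow>(table.Rows.Count);
        foreach (DataRow row in table.Rows)
        {
            rows.Add(new ReportRow
            {
                Id = (int)row["Id"],
                // "as string" turns DBNull into null instead of throwing.
                Name = row["Name"] as string,
                Timestamp = (DateTime)row["Timestamp"]
            });
        }
        return rows;
    }
}
```

A List<ReportRow> like this can be returned from a WCF operation directly, or handed to a contract-based serializer.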

Another option is to send the data as a Stream. I'm pretty sure WCF supports this natively, and in theory it would allow you to use a different Stream (not MemoryStream) that can actually hold much more data. As a cheap option you could use a temporary file as a scratch area; if that works, you could investigate a custom in-memory stream that stitches multiple buffers together.
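The temporary-file idea could look roughly like this. It is only a sketch: TempFileScratch and ReadViaTempFile are invented names, and it assumes the compressed payload arrives as a Stream (for example from a streamed WCF operation) rather than as a byte[]. The actual deserialization is passed in as a callback so the helper stays independent of the formatter.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper; not part of the question's code base.
public static class TempFileScratch
{
    // Spools the compressed payload to a temp file, then lets `read`
    // consume the decompressed data as a stream.
    public static T ReadViaTempFile<T>(Stream compressedSource, Func<Stream, T> read)
    {
        string path = Path.GetTempFileName(); // creates an empty file on disk
        try
        {
            using (var file = new FileStream(path, FileMode.Open, FileAccess.ReadWrite,
                                             FileShare.None, 81920, FileOptions.SequentialScan))
            {
                // The payload goes to disk instead of into one large byte[].
                compressedSource.CopyTo(file);
                file.Position = 0;

                using (var inflater = new DeflateStream(file, CompressionMode.Decompress))
                {
                    return read(inflater);
                }
            }
        }
        finally
        {
            File.Delete(path); // clean up the scratch file even if reading fails
        }
    }
}
```

With the question's code, the call might be `TempFileScratch.ReadViaTempFile(stream, s => (DataSet) new BinaryFormatter().Deserialize(s))`. This only removes the large intermediate byte array; the deserialized DataSet itself still has to fit in memory.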

Marc Gravell
  • *"Note, however, that DataSet and DataTable are inherently not great candidates for optimization."* Heed this! Use your own simple class instead. – BernieP Nov 14 '18 at 16:22