
I have a type for which I need to do the following:

  1. Zip, serialize, and write it to a file. This can happen multiple times.
  2. Recreate a list of objects of this type, i.e. deserialize them and store them in a list or collection.

I have tried a few solutions, such as the one below, but they are slow. I need a fast solution that works with .NET 4.0.

What I tried:

  1. Create a zip stream, serialize the object into it with BinaryFormatter, then use a StreamWriter to write the result as one line in the file and close it.

To recreate the list I do the same in reverse: read a line, unzip it, and deserialize one object at a time.

John Koerner
  • Sounds like you did it the right way. Buy a faster computer. :) – Martin Mulder May 01 '13 at 17:04
  • Do you really need compression (zip)? That's going to eat into your performance quite a bit. – Scott Jones May 01 '13 at 17:13
  • BinaryFormatter is NOT fast. If you have many and/or large objects, it will not be nearly as fast as e.g. writing the objects manually using a BinaryWriter. – Anders Forsgren May 01 '13 at 17:43
  • @Anders Forsgren: My type is very complex and contains sub-lists of more, similar sub-types, so that does not seem to be a good option. Thanks for the suggestion. – Abhishek Dutta May 01 '13 at 18:21
  • @Scott Jones: I need to GZip it because the file size grows over time; sometimes even the GZip version reaches 80-90 MB. So I need the compression. – Abhishek Dutta May 01 '13 at 18:23
  • I haven't tried this, but here's a supposedly fast serialization option: http://www.codeproject.com/Articles/14164/A-Fast-Serialization-Technique. Apparently, the binary formatter jams a ton of metadata about your objects and properties into the byte array so it knows what to serialize them as. The approach in that link eliminates the need for it. – valverij May 01 '13 at 21:06

1 Answer


From what I understand, the BinaryFormatter + StreamWriter combination can become slow and bloated because BinaryFormatter adds metadata about the type, its properties, and their data types to the serialized bytes.

One option, if you are willing to work with a third-party library, is Protocol Buffers. According to the site, it is a lightweight, fast serialization format that Google uses in its data communications. It is also recommended in this Stack Overflow question: Fast and compact object serialization in .NET.

Two libraries are available for .NET: "protobuf-net" and "proto#".

Here is a table of results comparing both of them to other serialization techniques:

Serializer                  size    serialize    deserialize
-------------------------------------------------------------
protobuf-net                 3         268         1,881
proto#                       3         76          1,792
BinaryFormatter             153       6,694        8,420
SoapFormatter               687       28,609       55,125
XmlSerializer               153       14,594       19,819
DataContractSerializer      205       3,263        10,516
DataContractJsonSerializer  26        2,854        15,621

If you would prefer a little more control, though (and if you are just serializing objects), this Code Project article describes a neat pattern for it: http://www.codeproject.com/Articles/14164/A-Fast-Serialization-Technique

The idea is that you implement the ISerializable interface on whatever class you need to serialize. This requires you to add the ISerializable.GetObjectData method. Inside it, you use the article's SerializationWriter helper to write each field individually, and then add the result to the SerializationInfo object. The syntax for this is very straightforward.

Here is a quick, abbreviated sample of the GetObjectData method from the article:

// Serialize the object. Write each field to the SerializationWriter
// then add this to the SerializationInfo parameter

public void GetObjectData (SerializationInfo info, StreamingContext ctxt) {
    SerializationWriter sw = SerializationWriter.GetWriter ();
    sw.Write (id1);
    sw.Write (id2);
    sw.Write (id3);
    sw.Write (s1);
    sw.Write (s2); 

    // more properties here         

    sw.AddToInfo (info);
}

Here are the results of this author's tests:

                         Formatter      Size (bytes)     Time (µs)
--------------------------------------------------------------------
Standard serialization    Binary           2080           364
Fast serialization        Binary           421            74
Fast serialization        SOAP             1086           308
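
For comparison, the same "write each field yourself" idea can be sketched with only the framework's own BinaryWriter, BinaryReader and GZipStream, which also covers the append-many-times and read-back-as-a-list requirements from the question. This is not code from the linked article: the Item type, the ItemFile helper and the file layout (a 4-byte length followed by one gzip-compressed batch) are all assumptions made up for illustration, and a real type with nested sub-lists would need a WriteTo/ReadFrom pair per sub-type.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

// Hypothetical type standing in for the asker's real class.
public sealed class Item
{
    public int Id;
    public string Name = "";
    public List<double> Values = new List<double>();

    // Write each field by hand; no type metadata is emitted.
    public void WriteTo(BinaryWriter w)
    {
        w.Write(Id);
        w.Write(Name);
        w.Write(Values.Count);
        foreach (double v in Values) w.Write(v);
    }

    // Read the fields back in exactly the order they were written.
    public static Item ReadFrom(BinaryReader r)
    {
        var item = new Item();
        item.Id = r.ReadInt32();
        item.Name = r.ReadString();
        int count = r.ReadInt32();
        for (int i = 0; i < count; i++) item.Values.Add(r.ReadDouble());
        return item;
    }
}

// Hypothetical helper; assumed layout: [int length][gzip bytes], repeated.
public static class ItemFile
{
    // Compresses one batch in memory and appends it to the file.
    public static void AppendBatch(string path, ICollection<Item> items)
    {
        var buffer = new MemoryStream();
        using (var gzip = new GZipStream(buffer, CompressionMode.Compress, true))
        using (var w = new BinaryWriter(gzip))
        {
            w.Write(items.Count);
            foreach (Item item in items) item.WriteTo(w);
        }

        byte[] bytes = buffer.ToArray();
        using (var file = new FileStream(path, FileMode.Append, FileAccess.Write))
        using (var fw = new BinaryWriter(file))
        {
            fw.Write(bytes.Length); // length prefix marks where the batch ends
            fw.Write(bytes);
        }
    }

    // Reads every batch in the file and returns all items as one list.
    public static List<Item> ReadAll(string path)
    {
        var result = new List<Item>();
        using (var file = File.OpenRead(path))
        using (var fr = new BinaryReader(file))
        {
            while (file.Position < file.Length)
            {
                int length = fr.ReadInt32();
                byte[] bytes = fr.ReadBytes(length);
                using (var gzip = new GZipStream(new MemoryStream(bytes), CompressionMode.Decompress))
                using (var r = new BinaryReader(gzip))
                {
                    int count = r.ReadInt32();
                    for (int i = 0; i < count; i++) result.Add(Item.ReadFrom(r));
                }
            }
        }
        return result;
    }
}
```

Calling ItemFile.AppendBatch(path, items) once per write and ItemFile.ReadAll(path) once at load time would give the append-then-rebuild behaviour described in the question. The length prefix is used instead of one text line per object because compressed bytes can contain newline characters, so a line-based StreamWriter format is not a safe container for them. Compressing a batch at a time rather than one object at a time should also help both speed and ratio, though that would need measuring on the real data.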
valverij