
I often read that BinaryFormatter performs better than XmlSerializer. Out of curiosity, I wrote a test app.

A WTF moment... why is XML so much faster than binary (especially for deserialization)?

using System;
using System.Collections.Generic;
using System.Runtime.Serialization;
using System.Xml.Serialization;
using System.Runtime.Serialization.Formatters.Binary;
using System.IO;

namespace SerPlayground
{
    class Program
    {
        static void Main(string[] args)
        {
            var items = new List<TestClass>();
            for (int i = 0; i < 1E6; i++)
            {
                items.Add(new TestClass() { Name = i.ToString(), Id = i });
            }

            File.Delete("test.bin");
            using (var target = new FileStream("test.bin", FileMode.OpenOrCreate))
            {
                System.Threading.Thread.Sleep(1000);
                var bin = new BinaryFormatter();
                var start = DateTime.Now;
                bin.Serialize(target, items);
                Console.WriteLine("Bin: {0}", (DateTime.Now - start).TotalMilliseconds);

                target.Position = 0;
                System.Threading.Thread.Sleep(1000);
                start = DateTime.Now;
                bin.Deserialize(target);
                Console.WriteLine("Bin-D: {0}", (DateTime.Now - start).TotalMilliseconds);
            }

            File.Delete("test.xml");
            using (var target = new FileStream("test.xml", FileMode.OpenOrCreate))
            {
                System.Threading.Thread.Sleep(1000);
                var xml = new XmlSerializer(typeof(List<TestClass>));
                var start = DateTime.Now;
                xml.Serialize(target, items);
                Console.WriteLine("Xml: {0}", (DateTime.Now - start).TotalMilliseconds);

                target.Position = 0;
                System.Threading.Thread.Sleep(1000);
                start = DateTime.Now;
                xml.Deserialize(target);
                Console.WriteLine("Xml-D: {0}", (DateTime.Now - start).TotalMilliseconds);
            }

            Console.ReadKey();
        }
    }

    [Serializable]
    public class TestClass
    {
        public string Name { get; set; }
        public int Id { get; set; }
    }
}

My results (in milliseconds):

Bin: 13472.7706
Bin-D: 121131.9284
Xml: 8917.51
Xml-D: 12841.7345
Lukas
  • Deserialization is far slower than serialization. Can you modify your sample to do both? That will be a much more interesting comparison. – Kirk Woll Aug 22 '10 at 21:37
  • Hi, I would consider running each of your tests multiple (100+?) times, either with the startup (creating the formatters/serializers) included in the loop or without. Timing many more runs might give you a more accurate picture of the performance. Also consider using the Stopwatch class for the timing, as I believe it uses the high-performance timer where possible. If you still get faster XML serialization, it would be good to know why as well! – Jennifer Aug 22 '10 at 21:38
  • I've always understood that binary is "faster" with respect to network transfer, as it will contain fewer bytes. I would expect it would take a pretty hefty payload for the difference to be evident, though. – kbrimington Aug 22 '10 at 21:38
  • @Jennifer: I tried the Stopwatch class, and the results were the same. – Robert Harvey Aug 22 '10 at 21:54
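
The suggestions in the comments above (repeat the runs, time with Stopwatch, keep one-time setup out of the measurement) could be sketched roughly as follows. This is only an illustration, not the code anyone in the thread actually ran: the helper name `MeasureAverageMs`, the item count and the run count are assumptions made for the example, and it times only XmlSerializer against a MemoryStream, because BinaryFormatter is marked obsolete in recent .NET versions.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Xml.Serialization;

public class TestClass
{
    public string Name { get; set; }
    public int Id { get; set; }
}

public static class SerializerBenchmark
{
    // Hypothetical helper: runs the action once untimed as a warm-up,
    // then returns the mean duration of the timed runs in milliseconds.
    public static double MeasureAverageMs(Action action, int runs)
    {
        action(); // warm-up: JIT compilation and one-time serializer setup are not timed

        // Start the timed runs from a collected heap to reduce GC noise.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        var watch = Stopwatch.StartNew();
        for (int i = 0; i < runs; i++)
        {
            action();
        }
        watch.Stop();
        return watch.Elapsed.TotalMilliseconds / runs;
    }

    public static void Main()
    {
        var items = new List<TestClass>();
        for (int i = 0; i < 10000; i++)
        {
            items.Add(new TestClass { Name = i.ToString(), Id = i });
        }

        var xml = new XmlSerializer(typeof(List<TestClass>));
        using (var buffer = new MemoryStream())
        {
            double serializeMs = MeasureAverageMs(() =>
            {
                buffer.SetLength(0); // start each run from an empty stream
                xml.Serialize(buffer, items);
            }, 20);

            double deserializeMs = MeasureAverageMs(() =>
            {
                buffer.Position = 0; // rewind to the last serialized payload
                xml.Deserialize(buffer);
            }, 20);

            Console.WriteLine("Xml:   {0:F2} ms per run", serializeMs);
            Console.WriteLine("Xml-D: {0:F2} ms per run", deserializeMs);
        }
    }
}
```

Writing to a MemoryStream instead of a file keeps disk I/O out of the numbers. Whether that is desirable depends on what you want to measure.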

3 Answers


Because you are serialising an object that doesn't have any properties.

If you serialise something that actually contains some data, for example a string, the binary serialiser is a lot faster than the XML serialiser.

I did this change to your code:

items.Add("asfd");

and I get this result:

Xml: 1219.0541
Bin: 165.0002

Part of the difference is, of course, that the XML file is about ten times larger than the binary file.

Robert Harvey
Guffa
  • Worth noting is that the whole thing runs almost twice as fast if you swap the position of the XML and BIN blocks (although the proportion of time used between the blocks stays about the same). So I believe the whole benchmark is suspect, perhaps due to garbage collection or some other factor. – Robert Harvey Aug 22 '10 at 21:51
  • asfd - American Society of Furniture Designers? – Russ Cam Aug 22 '10 at 21:51
  • & Robert Harvey: I've added a more complex class. Swapping XML and BIN doesn't change anything now; the result is the same: XML is faster. – Lukas Aug 22 '10 at 22:26
  • @Lukas, you're still only serializing. You really ought to deserialize too. – Kirk Woll Aug 22 '10 at 22:55
  • @Kirk Woll: Now I deserialize too. – Lukas Aug 22 '10 at 23:10

The example is pretty good and the question is interesting. (I agree with Robert that you should run the Main method itself at least once before taking any measurements, as initialization of various sorts shouldn't be considered part of the test.)

That being said, one key difference between XmlSerializer and BinaryFormatter (aside from the obvious) is that XmlSerializer makes no attempt to keep track of references. If your object graph has multiple references to the same object, you get multiple copies in the XML, and these are not resolved back into a single object upon deserialization. Worse, if you have cycles, the object can't be serialized at all. Contrast this with BinaryFormatter, which does keep track of references and reliably reconstructs the object graph no matter how many, and what sort of, object references you may have. Perhaps the overhead of this facility accounts for the poorer performance?
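
To make the reference-tracking point concrete, here is a minimal sketch of the XmlSerializer side only (BinaryFormatter is marked obsolete in recent .NET versions, so it is left out). The `Node` class and the helper names `RoundTrip` and `FailsToSerialize` are made up for this illustration. It assumes that XmlSerializer reports a cycle by throwing an InvalidOperationException.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type for the demonstration; XmlSerializer needs it to be public.
public class Node
{
    public string Label { get; set; }
    public Node Next { get; set; }
}

public static class ReferenceDemo
{
    // Serializes the list to an in-memory stream and reads it back.
    public static List<Node> RoundTrip(List<Node> nodes)
    {
        var xml = new XmlSerializer(typeof(List<Node>));
        using (var buffer = new MemoryStream())
        {
            xml.Serialize(buffer, nodes);
            buffer.Position = 0;
            return (List<Node>)xml.Deserialize(buffer);
        }
    }

    // Returns true if XmlSerializer refuses to serialize the graph.
    public static bool FailsToSerialize(List<Node> nodes)
    {
        try
        {
            RoundTrip(nodes);
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static void Main()
    {
        var shared = new Node { Label = "shared" };
        var original = new List<Node> { shared, shared };

        // Before serialization both entries are the same instance.
        Console.WriteLine(ReferenceEquals(original[0], original[1]));

        // After the round trip each entry is its own copy.
        var copy = RoundTrip(original);
        Console.WriteLine(ReferenceEquals(copy[0], copy[1]));

        // A node that points to itself forms a cycle.
        var loop = new Node { Label = "loop" };
        loop.Next = loop;
        Console.WriteLine(FailsToSerialize(new List<Node> { loop }));
    }
}
```

The shared instance is written out twice, so the identity relationship between the two list entries is lost after deserialization even though the data itself survives.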

The main reason to use BinaryFormatter over XmlSerializer is the size of the output, not the performance of the serialization/deserialization. (The overhead of constructing text is not so great; it's the transporting of that XML text that is expensive.)

Kirk Woll

Also see What are the differences between the XmlSerializer and BinaryFormatter

Community
H.Wolper