9

I use an XmlSerializer to serialize and deserialize some objects. The problem is performance: profiling shows that using the XmlSerializer adds about 2 seconds to our application's startup time. We already cache our XmlSerializer instances and reuse them. We cannot use sgen.exe because we create the XmlSerializer with XmlAttributeOverrides.
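For readers unfamiliar with the caching mentioned above, here is a minimal sketch of what such a cache might look like. It relies on the documented behavior that only the `XmlSerializer(Type)` and `XmlSerializer(Type, String)` constructors reuse their generated assemblies, so serializers built with `XmlAttributeOverrides` have to be cached by the caller. The `SerializerCache` class and the `overridesKey` parameter are hypothetical names for illustration, not the asker's actual code.

```csharp
using System;
using System.Collections.Concurrent;
using System.Xml.Serialization;

// Hypothetical helper: keeps one XmlSerializer per (type, overrides name) pair.
public static class SerializerCache
{
    private static readonly ConcurrentDictionary<(Type, string), XmlSerializer> Cache =
        new ConcurrentDictionary<(Type, string), XmlSerializer>();

    // overridesKey is an assumed, caller-chosen name identifying one set of overrides.
    // buildOverrides runs only when no serializer is cached for the key yet
    // (under contention GetOrAdd may invoke it more than once, but a single
    // instance wins and is returned to every caller afterwards).
    public static XmlSerializer Get(
        Type type, string overridesKey, Func<XmlAttributeOverrides> buildOverrides)
    {
        return Cache.GetOrAdd(
            (type, overridesKey),
            _ => new XmlSerializer(type, buildOverrides()));
    }
}
```

A cache like this avoids paying the construction cost twice, but it does nothing for the first construction, which is the cost this question is about.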

I tried a serialization alternative, Json.Net, and at first it worked great. The problem is that we need to be backward compatible, so all the XML that has already been generated must still be parsed correctly. Also, the serialized output must be XML.

To summarize:

  1. I receive XML data serialized by an XmlSerializer.
  2. I need to deserialize the XML data and convert it into an object.
  3. I need to serialize objects into XML (ideally in the same format an XmlSerializer would have produced).
Melursus
  • Your XML data contains the opening tag `` and the closing tag ``, which I suppose is a typo. Moreover, your question defines the format of the input data but does not clearly define the format of the JSON output. Just as the same information can be represented in different XML formats, you can produce different JSON data with an equivalent information set but a different format. I think you should define the format of the output data more clearly. – Oleg Mar 05 '12 at 16:32
  • It would be good if you also clarified the restriction "I can't use a XmlSerializer." If the only reason is performance, there are many ways to improve it, such as using sgen.exe or implementing the `ISerializable` interface. What is most unclear in the question is why the data has such a strange input format. Do you have one long XML file or many such files? Typically the original information lives in a database, so why do you need such strange XML input instead of accessing the *original* data? – Oleg Mar 05 '12 at 16:44
  • I updated my question to better describe my problem. – Melursus Mar 05 '12 at 18:02
  • Is this only a startup time issue with the XmlSerializer? As Oleg says, serialization and deserialization can be quite fast (the code is compiled) if the XmlSerializer is used properly, for example if the right constructors are called. – Simon Mourier Mar 05 '12 at 18:16
  • Yes, it's only at startup, because that is when we deserialize our objects. What takes time is building the XmlSerializer. After that it's pretty fast, but we need to improve our startup time. – Melursus Mar 05 '12 at 18:24
  • Could you deserialize on a background thread at startup, so that it seems faster to your users? – Phil Mar 09 '12 at 22:45
  • You can also use JSON and XML side by side: if the JSON file exists, load it; otherwise load the XML file. Of course, you have to save both XML and JSON whenever updates are needed. – L.B Mar 10 '12 at 16:16
  • @Oleg `ISerializable` is unrelated to the question – Marc Gravell Mar 12 '12 at 08:34
  • @MarcGravell: Thanks! It was a typo. I meant `IXmlSerializable`, of course. – Oleg Mar 12 '12 at 08:39

7 Answers

13

Ultimately, it depends on the complexity of your model. XmlSerializer needs to do a lot of thinking, and the fact that it is taking so long leads me to suspect that your model is pretty complex. For a simple model, it might be possible to manually implement the deserialization using LINQ-to-XML (pretty easy), or maybe even XmlReader (if you are feeling very brave - it isn't easy to get 100% correct).
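As a rough illustration of the LINQ-to-XML route for a simple model, here is a minimal sketch. The `Person` class and its `Name` and `Age` members are hypothetical stand-ins for the real model. The element names assume the default layout `XmlSerializer` produces for a `List<Person>` with no attribute overrides applied, so with overrides the names would need adjusting by hand.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical model, standing in for one of the real classes.
public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class PersonXml
{
    // Reads the assumed layout:
    // <ArrayOfPerson><Person><Name>..</Name><Age>..</Age></Person></ArrayOfPerson>
    public static List<Person> Read(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Root
            .Elements("Person")
            .Select(e => new Person
            {
                Name = (string)e.Element("Name"),  // null when the element is absent
                Age = (int?)e.Element("Age") ?? 0  // 0 when the element is absent
            })
            .ToList();
    }

    // Writes the same element layout back out, without indentation.
    public static string Write(IEnumerable<Person> people)
    {
        var root = new XElement("ArrayOfPerson",
            people.Select(p => new XElement("Person",
                new XElement("Name", p.Name),
                new XElement("Age", p.Age))));
        return root.ToString(SaveOptions.DisableFormatting);
    }
}
```

Note that this sketch is not byte-for-byte identical to `XmlSerializer` output: it omits the XML declaration and the `xmlns:xsi`/`xmlns:xsd` declarations, and it writes an empty `<Name />` element for a null name where `XmlSerializer` would omit the element. Differences like these are where the subtle bugs tend to creep in.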

However, if the model is complex, this is a problem and frankly would be very risky in terms of introducing subtle bugs.

Another option is DataContractSerializer which handles xml, but not as well as XmlSerializer, and certainly not with as much control over the layout. I strongly suspect that DataContractSerializer would not help you.

There is no direct replacement for XmlSerializer that I am aware of, and if sgen.exe isn't an option, I believe you basically have three options:

  • live with it
  • rewrite XmlSerializer yourself, somehow doing better than them
  • use something like LINQ-to-XML and accept the effort involved

Long term, I'd say "switch formats", and use xml for legacy import only. I happen to know of some very fast binary protocols that would be pretty easy to substitute in ;p

Marc Gravell
  • Excellent answer. I had to do something similar in a project where I had to use interfaces and not use abstract classes (since C# doesn't do multiple inheritance). Using linq to xml and reflection did the trick. My next task is making Attributes instead of a list of rules in parameters (just learned about that). I must say it's a lot of fun. – pqsk Sep 13 '12 at 16:25
2

This answer has some good info on why XmlSerializer runs slow with XmlAttributeOverrides.

Do you really need to use the XmlSerializer in your main thread at startup?

Maybe run it in a background thread. If only some parts of the data are mandatory for startup, perhaps you could manually read them into proxy/sparse versions of the real classes while the XmlSerializer initializes.
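The background-thread idea could be sketched roughly as follows: construction of the serializer starts on a thread-pool thread early in startup, and the first caller that actually needs it waits only for whatever construction time remains. `SerializerWarmUp`, `Start`, and `Deserialize` are hypothetical names, and this is a sketch under those assumptions rather than a drop-in solution.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using System.Xml.Serialization;

// Hypothetical helper: builds the XmlSerializer off the main thread.
public static class SerializerWarmUp
{
    private static Task<XmlSerializer> _pending;

    // Call as early as possible during startup; returns immediately
    // while the serializer is generated on a thread-pool thread.
    public static void Start(Type type, XmlAttributeOverrides overrides)
    {
        _pending = Task.Run(() => new XmlSerializer(type, overrides));
    }

    // Blocks only if the serializer is not ready yet, then deserializes.
    // Start must have been called first.
    public static object Deserialize(TextReader reader)
    {
        XmlSerializer serializer = _pending.GetAwaiter().GetResult();
        return serializer.Deserialize(reader);
    }
}
```

This does not reduce the total work; it only helps if the main thread has something else to do, such as showing the UI, before the deserialized data is needed.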

If it's a GUI app, you could just add a splash screen to hide the delay (or a game of Tetris!).

If all else fails, can't you convert the existing files to JSON by running the existing deserialization followed by a JSON serialization, or is there a hard requirement to keep them as XML?

Peter Wishart
2

The problem is that you are requesting types which are not covered by sgen, which results in new assemblies being generated during startup.

You could try to get your hands on the temp files generated by XmlSerializer for your specific types and use that code for your own pregenerated XmlSerializer assembly. I have used this approach to find out why csc.exe was being executed, which delayed the startup of my application.

Besides this, it might help to rename some types, as in the article, so that they match the type names sgen creates and you can use sgen after all. Usually arrays of types are not pre-created by sgen, which is sometimes a pity. But if you name your class ArrayOf HereGoesYourTypeName, you will be able to use the pregenerated assemblies.

Alois Kraus
0

Another option I don't see mentioned in any answer here is to compile a serialization assembly. This way all the code generation and code compilation steps happen during compile-time in Visual Studio, not at runtime when your app is starting.

The OP mentions that the app takes too long to start. Well, that's exactly what a serialization assembly is for.

In .NET Core the steps are pretty easy:

  1. Add a NuGet reference to Microsoft.XmlSerializer.Generator
  2. ...that's it :)

More info here https://learn.microsoft.com/en-us/dotnet/core/additional-tools/xml-serializer-generator

P.S. If you're still using .NET Framework (not .NET Core), see this question Generating an Xml Serialization assembly as part of my build

Alex from Jitbit
0

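You have to deserialize your list using classic .NET serialization, something like the below:

    // Requires: using System.Collections.Generic; using System.IO; using System.Xml.Serialization;
    var serializer = new XmlSerializer(typeof(List<YourObject>));
    List<YourObject> myCustomList;
    using (TextReader tr = new StreamReader("yourlist.xml")) // closed even if Deserialize throws
    {
        myCustomList = (List<YourObject>)serializer.Deserialize(tr);
    }

Then you can use JSON serialization:

    // Requires: using System.Web.Script.Serialization; (System.Web.Extensions assembly)
    var json = new JavaScriptSerializer();
    string output = json.Serialize(myCustomList); // Serialize returns a string, not a JsonResult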
Massimiliano Peluso
  • I can't use the .NET XmlSerializer because it performs poorly in our project. That's why I tried to use the JsonSerializer, but I need to stay compatible with our old format, so I need to handle XML serialized by XmlSerializer without using an XmlSerializer... – Melursus Feb 29 '12 at 16:06
0

If you have XML stored in a format that you cannot use, then use XSLT to transform it into a format that you can use.

If this XML is stored in the XmlSerializer format, let's say in flat files or a database, then you can run your transforms over it once and not incur the XmlSerializer overhead at your usual runtime.

Alternatively, you could apply the XSLT at run time, but I doubt that this would be faster than the method outlined by Massimiliano.

dice
0

You can use threading or tasks to make the application start up faster, so that it doesn't wait for the hard drive or the deserialization.

NPehrsson