Deserialize xml into a class with different hierarchy?

Question

This will deserialize an xml sample into the "XmlModel" class.

using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

namespace XmlTest
{
    public class DeserializeXml
    {
        public XmlModel GetXmlModel()
        {
            string xml = @"<?xml version=""1.0"" encoding=""utf-16""?>
            <root>
                <foo>
                    <bar>1</bar>
                    <bar>2</bar>
                </foo>
            </root>";

            var dS = new XmlSerializer(typeof(XmlModel));

            using (var reader = new StringReader(xml))
            {
                return (XmlModel) dS.Deserialize(reader);
            }
        }
    }

    [XmlRoot("root")]
    public class XmlModel
    {
        [XmlArray("foo")]
        [XmlArrayItem("bar")]
        public List<string> Foo { get; set; }
    }
}

This will get the model:

var d = new DeserializeXml();
var result = d.GetXmlModel();

I am working with legacy code and I cannot make changes to the XmlModel class other than changing the XmlAttributes. Here is the problem: the actual Xml has no "foo" node:

string xml = @"<?xml version=""1.0"" encoding=""utf-16""?>
<root>
    <bar>1</bar>
    <bar>2</bar>
</root>";

So now I am stuck with the task of making the deserializer swallow this xml and output type XmlModel. Is this possible without Xslt preprocessing or other more complicated methods?

You could deserialize to a format that matches the XML, then use something like [AutoMapper](http://automapper.org/) to "upgrade" it. Not super performant, but I don't know what your requirements are in that area, it might be acceptable. — Bradley Uffner, Oct 09 '17 at 19:41
What other restrictions are you under? Can you use a different method of deserialization? — Bradley Uffner, Oct 09 '17 at 19:55
@Bradley Uffner Using a proxy class plus AutoMapper is an option, but I think it would involve some effort and a performance penalty (more so than an XSLT preprocessor). An alternative serializer is an option I had not thought of and could be a solution if mere XmlAttribute tricks won't work (I was hoping they would). — TvdH, Oct 09 '17 at 20:01
There may be some built in way to do this with the `XmlSerializer` and `Attributes`. That's an area of .NET I haven't really explored much, so I'm not sure. I'm just trying to find what options are available for you. — Bradley Uffner, Oct 09 '17 at 20:03
The legacy code may be set in stone but the Xml is certainly not. Have you considered temporarily modifying the Xml (adding the Foo root node) prior to attempting to deserialize? Otherwise, it looks to me like you will need a custom deserializer. — JuanR, Oct 09 '17 at 20:10
@Juan Yes, manipulating the XML is an option. I wanted to check if there is something homecooked with XmlAttributes available. — TvdH, Oct 09 '17 at 20:48
@TvdH: Unlikely. The `XmlSerializer` is a basic implementation for simple scenarios. From what I have seen, anything beyond exact matching requires a custom serializer. — JuanR, Oct 09 '17 at 20:53
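
The XML-manipulation option raised in the comments above could be sketched roughly as follows: load the flat XML into an `XDocument`, move the `<bar>` elements under a new `<foo>` element, and hand the result to the unchanged `XmlSerializer`. The `FlatXmlAdapter` class and its `DeserializeFlat` method are hypothetical names used only for illustration, and `XmlModel` is repeated from the question so the sample stands alone. It assumes every `<bar>` is a direct child of `<root>`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.Serialization;

[XmlRoot("root")]
public class XmlModel
{
    [XmlArray("foo")]
    [XmlArrayItem("bar")]
    public List<string> Foo { get; set; }
}

// Hypothetical helper, not part of the original code base.
public static class FlatXmlAdapter
{
    // Created once with the (Type) constructor, so the generated assembly is reused.
    private static readonly XmlSerializer Serializer = new XmlSerializer(typeof(XmlModel));

    public static XmlModel DeserializeFlat(string xml)
    {
        XDocument doc = XDocument.Parse(xml);

        // Snapshot the <bar> children before detaching them from <root>.
        List<XElement> bars = doc.Root.Elements("bar").ToList();
        foreach (XElement bar in bars)
        {
            bar.Remove();
        }

        // Re-attach them under the <foo> container the legacy model expects.
        doc.Root.Add(new XElement("foo", bars));

        using (var reader = doc.CreateReader())
        {
            return (XmlModel)Serializer.Deserialize(reader);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        XmlModel model = FlatXmlAdapter.DeserializeFlat("<root><bar>1</bar><bar>2</bar></root>");
        Console.WriteLine(string.Join(",", model.Foo));
    }
}
```

This keeps the legacy class and its attributes untouched, at the cost of building the whole document in memory before deserializing.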

Bradley Uffner · Answer 1 · 2017-10-09T20:18:54.433

If you are open to an alternate method of deserialization, this will work. It should be at least as fast as the XmlSerializer. It simply opens an XmlReader on the raw XML, moves to the first "data" element, dumps the data into a list, then populates and returns your XmlModel from it.

LINQPad file available here.

public XmlModel GetXmlModel()
{
    string xml = @"<?xml version=""1.0"" encoding=""utf-16""?>
        <root>
                <bar>1</bar>
                <bar>2</bar>
        </root>";
    using (var reader = XmlReader.Create(new StringReader(xml)))
    {
        reader.MoveToContent();
        var data = new List<string>();
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element)
            {
                var element = XNode.ReadFrom(reader) as XElement;
                switch (element.Name.LocalName)
                {
                    case "bar":
                        {
                            data.Add(element.Value);
                            break;
                        }
                }
            }
        }
        return new XmlModel() { Foo = data };
    }
}

This obviously gets more complex if your bar class is more than a simple intrinsic type, like string.

Your answer looks as though it was adapted from http://msdn.microsoft.com/en-us/library/system.xml.linq.xnode.readfrom.aspx, but unfortunately the MSDN code has a bug -- it skips elements when the XML is not indented. See [this answer](https://stackoverflow.com/a/18282052/3744182) for an analysis. Personally I'd suggest just loading into an `XElement` and doing everything in memory, precisely because working with `XmlReader` directly is so fussy. — dbc, Oct 09 '17 at 22:44
I'll admit, I did glance at that example for reference. Your suggestion would be my preferred way too, but I was trying to keep things as fast as possible, since op mentioned performance in the discussion. — Bradley Uffner, Oct 10 '17 at 00:09
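
The in-memory alternative suggested in the comments above might look something like this minimal sketch. It parses the whole document with LINQ to XML and never touches `XmlReader` directly, so it should not depend on whether the XML is indented. `InMemoryReader` is a hypothetical name, the `XmlModel` here is a stripped-down stand-in for the question's class, and only `<bar>` elements that are direct children of the root are read.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Stand-in for the legacy model from the question (attributes omitted).
public class XmlModel
{
    public List<string> Foo { get; set; }
}

// Hypothetical helper illustrating the XElement-based approach.
public static class InMemoryReader
{
    public static XmlModel GetXmlModel(string xml)
    {
        // Load the full document, then query it in memory.
        XElement root = XDocument.Parse(xml).Root;

        // Collect the text of each <bar> child of <root>, in document order.
        List<string> data = root.Elements("bar")
                                .Select(e => e.Value)
                                .ToList();

        return new XmlModel { Foo = data };
    }
}

public static class Program
{
    public static void Main()
    {
        // Deliberately unindented input.
        XmlModel model = InMemoryReader.GetXmlModel("<root><bar>1</bar><bar>2</bar></root>");
        Console.WriteLine(string.Join(",", model.Foo));
    }
}
```

The trade-off is memory: the whole document is materialized, which is usually fine for small payloads but may matter for very large files.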

score 1 · Accepted Answer · answered Oct 09 '17 at 23:25

You can use XmlAttributeOverrides to specify alternate XML attributes for your XmlModel, then construct an XmlSerializer using those attributes by doing:

var serializer = new XmlSerializer(typeof(XmlModel), overrides);

However, note the following warning from the documentation:

To increase performance, the XML serialization infrastructure dynamically generates assemblies to serialize and deserialize specified types. The infrastructure finds and reuses those assemblies. This behavior occurs only when using the following constructors:

XmlSerializer.XmlSerializer(Type)

XmlSerializer.XmlSerializer(Type, String)

If you use any of the other constructors, multiple versions of the same assembly are generated and never unloaded, which results in a memory leak and poor performance. The easiest solution is to use one of the previously mentioned two constructors. Otherwise, you must cache the assemblies in a Hashtable...

The following static class creates and caches two serializers: one for the "current" version of XmlModel, and one for the "alternate" version in which the <bar> elements lack an outer container element:

public static class XmlModelSerializer<TRoot>
{
    static XmlSerializer alternateSerializerInstance;
    static XmlSerializer currentSerializerInstance;

    public static XmlSerializer AlternateSerializerInstance { get { return alternateSerializerInstance; } }

    public static XmlSerializer CurrentSerializerInstance { get { return currentSerializerInstance; } }

    static XmlModelSerializer()
    {
        XmlAttributes alternateAttributes = new XmlAttributes
        {
            XmlElements = { new XmlElementAttribute("bar") },
        };
        XmlAttributeOverrides alternateOverrides = new XmlAttributeOverrides();
        alternateOverrides.Add(typeof(XmlModel), "Foo", alternateAttributes);
        alternateSerializerInstance = new XmlSerializer(typeof(TRoot), alternateOverrides);

        XmlAttributes currentAttributes = new XmlAttributes
        {
            XmlArray = new XmlArrayAttribute("foo"),
            XmlArrayItems = { new XmlArrayItemAttribute("bar") },
        };
        XmlAttributeOverrides currentOverrides = new XmlAttributeOverrides();
        currentOverrides.Add(typeof(XmlModel), "Foo", currentAttributes);
        currentSerializerInstance = new XmlSerializer(typeof(TRoot), currentOverrides);
    }
}

By using two different serializers, one for each possible XML format, you can avoid making any changes at all to your legacy XmlModel type.

Then, to deserialize flattened XML of the form

<root>
    <bar>1</bar>
    <bar>2</bar>
</root>

you would simply do:

var dS = XmlModelSerializer<XmlModel>.AlternateSerializerInstance;
using (var reader = new StringReader(xml))
{
    return (XmlModel) dS.Deserialize(reader);
}

Sample fiddle showing deserialization in both formats.

Not only does this demonstrate that using `[XmlElement("bar")]` above `public List<string> Foo { get; set; }` will omit the "foo" node, it also shows how to dynamically switch between the two deserializations - grand. — TvdH, Oct 10 '17 at 07:01
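
If only the flat format ever needs to be read, the attribute-only change noted in the comment above can be sketched as follows. This is an illustration rather than the exact legacy class: the `XmlModel` below differs from the question's only in its attributes, and `Program` is a hypothetical harness.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

[XmlRoot("root")]
public class XmlModel
{
    // XmlElement on a collection maps each <bar> directly under <root>,
    // with no wrapping <foo> element.
    [XmlElement("bar")]
    public List<string> Foo { get; set; }
}

public static class Program
{
    public static XmlModel Deserialize(string xml)
    {
        // The (Type) constructor lets the framework cache the generated assembly.
        var serializer = new XmlSerializer(typeof(XmlModel));
        using (var reader = new StringReader(xml))
        {
            return (XmlModel)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        XmlModel model = Deserialize("<root><bar>1</bar><bar>2</bar></root>");
        Console.WriteLine(string.Join(",", model.Foo));
    }
}
```

With this variant no `XmlAttributeOverrides` are needed, but the class can no longer read the nested `<foo>` format; the two-serializer approach in the answer is the one that handles both.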
