How to do a polymorphic deserialization in C# given an XSD?

Question

I have the following given:

1) An XML Schema (XSD file), compiled to C# classes using the xsd.exe tool.

2) A RabbitMQ message queue containing well formed messages in XML of any type defined in the XML Schema. Here are two snippets of different messages:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<UserReport xmlns=".../v5.1" ... >
    ... User report message content... 
</UserReport>

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<CaptureReport xmlns=".../v5.1" ...>
    ... Capture report message content... 
</CaptureReport>

3) Experience using the .NET XmlSerializer class to deserialize when the type is known.

The question is how to deserialize messages from XML to an object when the type is unknown. It's not possible to instantiate the XmlSerializer, because the type is unknown.

One way is to loop through all possible types until deserialization succeeds, which is a bad solution because there are many different types defined in the XML Schema.

Are there any other alternatives?

softwariness · Accepted Answer · 2015-01-27T13:13:52.503

There are a few approaches you can take depending on how exactly you've achieved your polymorphism in the XML itself.

Element name is the type name (reflection approach)

You could get the root element name like this:

string rootElement = null;

using (XmlReader reader = XmlReader.Create(xmlFileName))
{
    while (reader.Read())
    {
        // We won't have to read much of the file to find the root element as it will be the first one found
        if (reader.NodeType == XmlNodeType.Element)
        {
            rootElement = reader.Name;
            break;
        }
    }
}

Then you could find the type by reflection like this (adjust reflection as necessary if your classes are in a different assembly):

var serializableType = Type.GetType("MyApp." + rootElement);
var serializer = new XmlSerializer(serializableType);

You would be advised to cache the mapping from the element name to the XML serializer if performance is important.
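A minimal sketch of such a cache, assuming the generated classes live in a `MyApp` namespace in the same assembly as the calling code. `SerializerCache` and the `UserReport` stand-in class are hypothetical names used only for illustration:

```csharp
using System;
using System.Collections.Concurrent;
using System.Xml.Serialization;

namespace MyApp
{
    // Hypothetical stand-in for a class generated by xsd.exe.
    public class UserReport
    {
        public string Content { get; set; }
    }

    public static class SerializerCache
    {
        // One serializer per root element name, created on first use and then reused.
        private static readonly ConcurrentDictionary<string, XmlSerializer> Cache =
            new ConcurrentDictionary<string, XmlSerializer>();

        public static XmlSerializer For(string rootElement)
        {
            return Cache.GetOrAdd(rootElement, name =>
            {
                // Assumption: type name == element name, in the "MyApp" namespace.
                Type type = Type.GetType("MyApp." + name);
                if (type == null)
                {
                    throw new InvalidOperationException(
                        "No type found for element '" + name + "'.");
                }
                return new XmlSerializer(type);
            });
        }
    }
}
```

With this in place, `SerializerCache.For(rootElement)` can replace the two reflection lines above, and repeated messages of the same type reuse the same serializer instance.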

Element name maps to the type name

If the XML element names are different from the type names, or you don't want to do reflection, you could instead create a Dictionary mapping from the element names in the XML to the XmlSerializer objects, but still look-up the root element name using the snippet above.
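The dictionary approach might look like the sketch below. The DTO class names (`UserReportDto`, `CaptureReportDto`) and the `MessageDeserializer` helper are hypothetical; only the element names come from the question. It uses `XmlReader.MoveToContent()` as a shorter way to position the reader on the root element, then hands the same reader to the serializer.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical message classes whose names differ from their XML element names.
[XmlRoot("UserReport")]
public class UserReportDto
{
    public string Content { get; set; }
}

[XmlRoot("CaptureReport")]
public class CaptureReportDto
{
    public string Content { get; set; }
}

public static class MessageDeserializer
{
    // Built once, because XmlSerializer instances are costly to create.
    private static readonly Dictionary<string, XmlSerializer> Serializers =
        new Dictionary<string, XmlSerializer>
        {
            { "UserReport", new XmlSerializer(typeof(UserReportDto)) },
            { "CaptureReport", new XmlSerializer(typeof(CaptureReportDto)) }
        };

    public static object Deserialize(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            // Skips the XML declaration and stops on the root element.
            reader.MoveToContent();

            XmlSerializer serializer;
            if (!Serializers.TryGetValue(reader.LocalName, out serializer))
            {
                throw new NotSupportedException("Unknown root element: " + reader.LocalName);
            }
            return serializer.Deserialize(reader);
        }
    }
}
```

An unknown root element fails with an explicit exception instead of a null type, which tends to be easier to diagnose when reading from a queue.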

Common root element with polymorphism through xsi:type

If your XML messages all have the same root element name, and the polymorphism is achieved by having types identified using xsi:type, then you can do something like this:

using System;
using System.Xml.Serialization;

namespace XmlTest
{
    public abstract class RootElement
    {
    }

    public class TypeA : RootElement
    {
        public string AData
        {
            get;
            set;
        }
    }

    public class TypeB : RootElement
    {
        public int BData
        {
            get;
            set;
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            var serializer = new System.Xml.Serialization.XmlSerializer(typeof(RootElement),
                new Type[]
                {
                    typeof(TypeA),
                    typeof(TypeB)
                });
            RootElement rootElement = null;
            string axml = "<RootElement xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xsi:type=\"TypeA\"><AData>Hello A</AData></RootElement>";
            string bxml = "<RootElement xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xsi:type=\"TypeB\"><BData>1234</BData></RootElement>";

            foreach (var s in new string[] { axml, bxml })
            {
                using (var reader = new System.IO.StringReader(s))
                {
                    rootElement = (RootElement)serializer.Deserialize(reader);
                }

                TypeA a = rootElement as TypeA;

                if (a != null)
                {
                    Console.WriteLine("TypeA: {0}", a.AData);
                }
                else
                {
                    TypeB b = rootElement as TypeB;

                    if (b != null)
                    {
                        Console.WriteLine("TypeB: {0}", b.BData);
                    }
                    else
                    {
                        Console.Error.WriteLine("Unexpected type.");
                    }
                }
            }
        }
    }
}

Note the second parameter to the XmlSerializer constructor which is an array of additional types that you want the .NET serializer to know about.
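As an alternative sketch, the derived types can be declared on the base class with `[XmlInclude]`, so the single-type constructor is enough. Classes generated by xsd.exe for derived schema types generally carry these attributes already, but check your generated code. `XsiTypeExample` is a hypothetical helper name; the other class names mirror the example above.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// The attributes tell XmlSerializer which derived types may appear via xsi:type.
[XmlInclude(typeof(TypeA))]
[XmlInclude(typeof(TypeB))]
public abstract class RootElement
{
}

public class TypeA : RootElement
{
    public string AData { get; set; }
}

public class TypeB : RootElement
{
    public int BData { get; set; }
}

public static class XsiTypeExample
{
    // Held in a static field so the serializer is built only once.
    private static readonly XmlSerializer Serializer =
        new XmlSerializer(typeof(RootElement));

    public static RootElement Parse(string xml)
    {
        using (var reader = new StringReader(xml))
        {
            return (RootElement)Serializer.Deserialize(reader);
        }
    }
}
```

Calling `XsiTypeExample.Parse(axml)` with the `axml` string from the example above should yield a `TypeA` instance.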

I understand your answer. Unfortunately, I did not point out in the question that the XML Schema is given and doesn't have a root. This means that the actual type must be specified for XmlSerializer in order to deserialize, but the type is unknown. So I hope you understand the question now. — Jan Rou, Jan 25 '15 at 18:04
Can you describe the features of your XML snippets that would enable the question to be answered? From your comment, I think you're saying that each XML snippet has a different root element name - is that correct? If so, does the root element name uniquely determine the type of the XML snippet? Or alternatively, is the root element arbitrary but type still has to be determined using `xsi:type`? Or is there some other means by which the type is identifiable? — softwariness, Jan 25 '15 at 18:11
I have updated my answer to elaborate on approaches that suit your specific case. I have left the `xsi:type` example as it is also a solution to the general problem in the question title and may be of use to others. — softwariness, Jan 27 '15 at 13:16
Thank you for your help! I accept the first part of your answer, where you use the XmlReader to grab the root element name. I've tested it and it works, although I use XmlDocument and find the name with the property `DocumentElement.Name`. In the reflection I use the namespace that I defined when compiling the XML Schema with the xsd.exe tool. — Jan Rou, Jan 28 '15 at 10:20
Note that if you construct an `XmlSerializer` with extra types passed in at runtime, you **must** cache the serializer statically for later reuse or you will have a severe memory leak. For why, see [Memory Leak using StreamReader and XmlSerializer](https://stackoverflow.com/q/23897145/3744182). — dbc, May 19 '21 at 04:54

Alex Zhukovskiy · Answer 2 · 2018-04-18T14:08:31.907

This answer is based on @softwariness's, but it adds some automation.

If your classes are generated via xsd, then all root types are decorated with XmlRootAttribute, so we can use it:

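using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Xml;
using System.Xml.Serialization;

public class XmlVariantFactory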
{
    private readonly Dictionary<string, Type> _xmlRoots;

    public XmlVariantFactory() : this(Assembly.GetExecutingAssembly().GetTypes())
    {
    }

    public XmlVariantFactory(IEnumerable<Type> types)
    {
        _xmlRoots = types
                    .Select(t => new {t, t.GetCustomAttribute<XmlRootAttribute>()?.ElementName})
                    .Where(x => !string.IsNullOrEmpty(x.ElementName))
                    .ToDictionary(x => x.ElementName, x => x.t);
    }

    public Type GetSerializationType(XmlReader reader)
    {
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element)
            {
                return _xmlRoots[reader.LocalName];
            }
        }
        throw new ArgumentException("No known root type found for passed XML");
    }
}

It scans all types in the executing assembly and finds all possible XML roots. You can do the same for all loaded assemblies:

public XmlVariantFactory() : this(AppDomain.CurrentDomain.GetAssemblies().SelectMany(a => a.GetTypes()))
{
}

And then you just use it:

string xml = TestResource.firstRequestResponse;
// Use a fresh StringReader for each pass: the first XmlReader buffers and consumes its input.
var serializationType = new XmlVariantFactory().GetSerializationType(XmlReader.Create(new StringReader(xml)));
var xmlReader = XmlReader.Create(new StringReader(xml));
bool canDeserialize = new XmlSerializer(serializationType).CanDeserialize(xmlReader);
Assert.True(canDeserialize);

Nice variation of the first answer. — Jan Rou, Apr 19 '18 at 16:57

Jan Rou · Answer 3 · 2018-04-20T05:22:15.567

The people responsible for the XML Schema have added the XML tag to a content field in the RabbitMQ protocol header. The header holds the tag of the DTO (data transfer object) that was serialized to XML and sent, which makes an IoC container handy. I wrote a DTO builder interface and implemented it with a generic builder, so the builder builds a DTO once the DTO class is supplied as the generic type argument. Note that the DTO classes are generated by the xsd tool. In an IoC container such as MS Unity, I registered the builder interface implementations for all DTO classes and passed the XML tag to each registration call. The container's resolve function is then called with the XML tag received in the RabbitMQ header to instantiate the specific builder for that DTO.
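A rough sketch of that design is shown below. The answer does not include its code, so every name here (`IDtoBuilder`, `XmlDtoBuilder<T>`, `DtoBuilderRegistry`, the `UserReport` stand-in) is hypothetical, and a plain dictionary stands in for the IoC container's named registrations rather than the real Unity API.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical builder interface: turns an XML payload into a DTO.
public interface IDtoBuilder
{
    object Build(string xml);
}

// Generic implementation; T would be a class generated by the xsd tool.
public class XmlDtoBuilder<T> : IDtoBuilder
{
    // One serializer per closed generic type, created once.
    private static readonly XmlSerializer Serializer = new XmlSerializer(typeof(T));

    public object Build(string xml)
    {
        using (var reader = new StringReader(xml))
        {
            return Serializer.Deserialize(reader);
        }
    }
}

// Stand-in for the container: builders are registered and resolved by XML tag.
public class DtoBuilderRegistry
{
    private readonly Dictionary<string, IDtoBuilder> _builders =
        new Dictionary<string, IDtoBuilder>();

    public void Register<T>(string xmlTag)
    {
        _builders[xmlTag] = new XmlDtoBuilder<T>();
    }

    public IDtoBuilder Resolve(string xmlTag)
    {
        IDtoBuilder builder;
        if (!_builders.TryGetValue(xmlTag, out builder))
        {
            throw new KeyNotFoundException("No builder registered for tag '" + xmlTag + "'.");
        }
        return builder;
    }
}

// Hypothetical stand-in for a generated DTO class.
public class UserReport
{
    public string Content { get; set; }
}
```

Usage would be along the lines of `registry.Register<UserReport>("UserReport")` at startup, then `registry.Resolve(tagFromHeader).Build(messageBody)` for each received message, where `tagFromHeader` and `messageBody` come from the RabbitMQ delivery.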
