I recently registered event handlers for unknown nodes, elements and attributes with the XmlSerializer I use to deserialize complex types from a type hierarchy. I did this because some of the XML I receive comes from third parties, and I want to notice data format changes that could cause trouble on my end.

In the XML it produces, the XmlSerializer uses the standard XML Schema attribute xsi:type="somederivedtypename" to identify the actual derived type represented by an XML element.

I was surprised to see that the same serializer treats that very same attribute it just produced as unknown upon deserialization. Interestingly though, the deserialization is correct and complete (also with more complicated types and data in my real-world program). That means that the serializer evaluates the type information properly during an early stage in the deserialization. But during a later data-extraction stage the attribute is apparently mistaken for a true data part of the object, which is of course unknown.

In my application the gratuitous warnings end up cluttering a general-purpose log file, which is undesirable. In my opinion the serializer should read back the XML it produced without hiccups. My questions:

  • Am I doing something wrong?
  • Is there a workaround?

A minimal example is here:

using System;
using System.IO;
using System.Xml.Serialization;

namespace XsiTypeAnomaly
{
    /// <summary>
    /// A trivial base type.
    /// </summary>
    [XmlInclude(typeof(DerivedT))]
    public class BaseT{}

    /// <summary>
    /// A trivial derived type to demonstrate a serialization issue.
    /// </summary>
    public class DerivedT : BaseT
    {
        public int anInt { get; set; }
    }

    class Program
    {
        private static void serializer_UnknownAttribute(object sender, XmlAttributeEventArgs e)
        {
            Console.Error.WriteLine("Warning: Deserializing "
                    + e.ObjectBeingDeserialized
                    + ": Unknown attribute "
                    + e.Attr.Name);
        }

        private static void serializer_UnknownNode(object sender, XmlNodeEventArgs e)
        {
            Console.Error.WriteLine("Warning: Deserializing "
                    + e.ObjectBeingDeserialized
                    + ": Unknown node "
                    + e.Name);
        }

        private static void serializer_UnknownElement(object sender, XmlElementEventArgs e)
        {
            Console.Error.WriteLine("Warning: Deserializing "
                    + e.ObjectBeingDeserialized
                    + ": Unknown element "
                    + e.Element.Name);
        }

        /// <summary>
        /// Serialize, display the xml, and deserialize a trivial object.
        /// </summary>
        /// <param name="args"></param>
        static void Main(string[] args)
        {
            BaseT aTypeObj = new DerivedT() { anInt = 1 };
            using (MemoryStream stream = new MemoryStream())
            {
                var serializer = new XmlSerializer(typeof(BaseT));

                // register event handlers for unknown XML bits
                serializer.UnknownAttribute += serializer_UnknownAttribute;
                serializer.UnknownElement += serializer_UnknownElement;
                serializer.UnknownNode += serializer_UnknownNode;

                serializer.Serialize(stream, aTypeObj);
                stream.Flush();

                // output the xml
                stream.Position = 0;
                Console.Write((new StreamReader(stream)).ReadToEnd() + Environment.NewLine);
                stream.Position = 0;
                var serResult = serializer.Deserialize(stream) as DerivedT;

                Console.WriteLine(
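                        // "as" yields null if the result is not a DerivedT, so check before dereferencing
                        (serResult != null && serResult.anInt == 1 ? "Successfully " : "Unsuccessfully ")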
                    + "read back object");
            }
        }
    }
}

Output:

<?xml version="1.0"?>
<BaseT xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xsi:type="DerivedT">
  <anInt>1</anInt>
</BaseT>
Warning: Deserializing XsiTypeAnomaly.DerivedT: Unknown node xsi:type
Warning: Deserializing XsiTypeAnomaly.DerivedT: Unknown attribute xsi:type
Successfully read back object
Peter - Reinstate Monica
  • The warning seems to make sense because you create a serializer for `BaseT` and then actually feed in a `DerivedT` object. If you just create a serializer for `DerivedT`, the warning goes away. – jsanalytics Feb 22 '17 at 15:01
  • @jstreet But the whole point of this attribute is to enable the base class serializer to deserialize derived objects. Imagine a list of base, which can hold any derived type. The calling code doesn't know and doesn't care which actual derived types are held in the list. The list handling code was actually written before many of the derived types existed. – Peter - Reinstate Monica Feb 22 '17 at 15:11
  • I see your point, and the serializer is able to deserialize the derived object, with the "inconvenience" of the warning, because it actually doesn't know property `anInt`. Suggestion: when you create your serializer use `aTypeObj.GetType()` instead of using any explicit type, either base or derived. – jsanalytics Feb 22 '17 at 15:17
  • @jstreet The warning is not about `anInt`; that element is serialized and deserialized properly (as can be seen when I test the non-default value after deserialization). The warning is, as it says, about the attribute `xsi:type`. The attributes *could* syntactically carry object information (i.e., I could have serialized `anInt` as an attribute!), but are used by the serializer to store meta information about the type instead. The attributes `xmlns:xsi` and `xmlns:xsd` are correctly identified as "not part of the object data", but `xsi:type` is not, for some reason. I believe it's a bug. – Peter - Reinstate Monica Feb 22 '17 at 16:45

3 Answers

Am I doing something wrong?

I don't think so. I share your opinion that XmlSerializer ought to deserialize its own output without any warnings. Also, xsi:type is a standard attribute defined in the XML Schema specification, and obviously it is supported by XmlSerializer, as demonstrated by your example and documented in MSDN Library.

Therefore, this behavior simply looks like an oversight. I can imagine a group of Microsoft developers working on different aspects of XmlSerializer during the development of the .NET Framework, and not ever testing xsi:type and events at the same time.

That means that the serializer evaluates the type information properly during an early stage in the deserialization. But during a later data-extraction stage the attribute is apparently mistaken for a true data part of the object, which is of course unknown.

Your observation is correct.

The XmlSerializer class generates a dynamic assembly to serialize and deserialize objects. In your example, the generated method that deserializes instances of DerivedT looks something like this:

private DerivedT Read2_DerivedT(bool isNullable, bool checkType)
{
    // [Code that uses isNullable and checkType omitted...]

    DerivedT derivedT = new DerivedT();
    while (this.Reader.MoveToNextAttribute())
    {
        if (!this.IsXmlnsAttribute(this.Reader.Name))
            this.UnknownNode(derivedT);
    }

    this.Reader.MoveToElement();
    // [Code that reads child elements and populates derivedT.anInt omitted...]
    return derivedT;
}

The deserializer calls this method after it reads the xsi:type attribute and decides to create an instance of DerivedT. As you can see, the while loop raises the UnknownNode event for all attributes except xmlns attributes. That's why you get the UnknownNode (and UnknownAttribute) event for xsi:type.
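To see the effect of that loop in isolation, here is a minimal sketch that walks the root element's attributes with a plain XmlReader and collects every name that is not an xmlns declaration. The names AttributeLoopSketch and LooksLikeXmlns are made up for illustration, and LooksLikeXmlns only approximates the serializer's internal IsXmlnsAttribute check; this is not the generated code itself.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class AttributeLoopSketch
{
    // Assumption: approximates the internal IsXmlnsAttribute test
    // ("xmlns" itself, or any name starting with "xmlns:").
    static bool LooksLikeXmlns(string name)
    {
        return name == "xmlns" || name.StartsWith("xmlns:", StringComparison.Ordinal);
    }

    // Returns the attribute names on the root element that a loop like
    // the one in Read2_DerivedT would pass on to UnknownNode.
    public static List<string> AttributesReportedAsUnknown(string xml)
    {
        var reported = new List<string>();
        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent(); // skip the XML declaration, stop at the root element
            while (reader.MoveToNextAttribute())
            {
                if (!LooksLikeXmlns(reader.Name))
                    reported.Add(reader.Name);
            }
        }
        return reported;
    }
}
```

Fed the XML from the question, this should skip xmlns:xsi and xmlns:xsd and report only xsi:type, which matches the warnings in the question's output.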

The while loop is generated by the internal XmlSerializationReaderILGen.WriteAttributes method. The code is rather complicated, but I see no code path that would cause xsi:type attributes to be skipped (other than the second workaround I describe below).

Is there a workaround?

I would just ignore UnknownNode and UnknownAttribute events for xsi:type:

// Requires: using System.Xml; using System.Xml.Schema; using System.Xml.Serialization;
private static void serializer_UnknownNode(object sender, XmlNodeEventArgs e)
{
    if (e.NodeType == XmlNodeType.Attribute &&
        e.NamespaceURI == XmlSchema.InstanceNamespace && e.LocalName == "type")
    {
        // Ignore xsi:type attributes.
    }
    else
    {
        // [Log it...]
    }
}

// [And similarly for UnknownAttribute using e.Attr instead of e...]
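The UnknownAttribute counterpart mentioned in that last comment could look like the sketch below. The class name XsiTypeFilter and the helper IsXsiType are hypothetical names introduced here; the check compares the namespace URI and local name, so it should not depend on which prefix the document binds to the XML Schema instance namespace.

```csharp
using System;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

static class XsiTypeFilter
{
    // Hypothetical helper: true when the attribute is xsi:type,
    // regardless of the prefix used for the XSI namespace.
    public static bool IsXsiType(XmlAttribute attr)
    {
        return attr != null
            && attr.NamespaceURI == XmlSchema.InstanceNamespace
            && attr.LocalName == "type";
    }

    public static void OnUnknownAttribute(object sender, XmlAttributeEventArgs e)
    {
        if (IsXsiType(e.Attr))
            return; // written by XmlSerializer itself, so not a format change worth logging

        Console.Error.WriteLine("Warning: Deserializing "
            + e.ObjectBeingDeserialized
            + ": Unknown attribute "
            + e.Attr.Name);
    }
}
```

It would be registered the same way as the handlers in the question, for example serializer.UnknownAttribute += XsiTypeFilter.OnUnknownAttribute;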

Another (hackier, IMO) workaround is to map xsi:type to a dummy property in the BaseT class:

// Requires: using System.Diagnostics; using System.Xml.Schema; using System.Xml.Serialization;
[XmlInclude(typeof(DerivedT))]
public class BaseT
{
    [XmlAttribute("type", Namespace = XmlSchema.InstanceNamespace)]
    [DebuggerBrowsable(DebuggerBrowsableState.Never)] // Hide this useless property
    public string XmlSchemaType
    {
        get { return null; } // Must return null for XmlSerializer.Serialize to work
        set { }
    }
}
Michael Liu

I don't think that's the proper way of using XmlSerializer. Even though you get the proper output along with the warnings, in a more advanced scenario there's no telling what could go wrong.

You should use the actual derived type (aTypeObj.GetType()) or even generics to get this sorted.

Pedro Luz
  • Do you have documentation to support that statement? – Peter - Reinstate Monica Feb 22 '17 at 16:40
  • I don't really; I'm just saying that's the way I do it. All my serialization routines are defined in helpers and are reused throughout the application, using generics or .GetType() – Pedro Luz Feb 23 '17 at 12:00
  • The point is, it is impossible to know what specific derived type the object has (note that, after user1892538's answer, I verified that the events are also thrown for members, not only for XML roots). And then [Jon Skeet seems to say](http://stackoverflow.com/a/32368306/6996876) I should do exactly what I'm doing, so I suppose I'm using the serializer properly. – Peter - Reinstate Monica Feb 23 '17 at 12:18
  • I don't think you should mention my deleted answer and I prefer to delete also this comment if you don't reference me at all. Btw I quoted msdn: `you can declare valid types only on a single field or property, instead of declaring derived types at the base class. You can attach XmlElement, XmlAttribute, or XmlArrayItem attributes to a field and declare the types that the field or property can reference. Then the constructor of the XmlSerializer will add the code required to serialize and deserialize those types to the serialization classes` and I verified unknown events were not present. –  Feb 25 '17 at 08:38

Have you tried the XmlSerializer constructor that lets you pass the derived type as one of the extraTypes?

Look here: https://msdn.microsoft.com/en-us/library/e5aakyae%28v=vs.110%29.aspx

You can use it like this:

var serializer = new XmlSerializer(typeof(BaseT), new Type[] { typeof(DerivedT) });

Of course, in general you may want to retrieve the list of derived types from somewhere else, for example from another assembly.
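One way to collect the derived types from an assembly is reflection, as in the rough sketch below. DerivedTypeFinder and its methods are hypothetical names, and BaseT and DerivedT are repeated from the question only so that the snippet stands alone. Whether passing extraTypes changes the xsi:type events is not established here; the sketch only shows how the array could be built.

```csharp
using System;
using System.Linq;
using System.Reflection;
using System.Xml.Serialization;

public class BaseT { }

public class DerivedT : BaseT
{
    public int anInt { get; set; }
}

static class DerivedTypeFinder
{
    // Hypothetical helper: all externally visible types in the assembly
    // that derive (directly or indirectly) from baseType.
    public static Type[] FindDerivedTypes(Type baseType, Assembly assembly)
    {
        return assembly.GetTypes()
            .Where(t => t.IsVisible && t.IsSubclassOf(baseType))
            .ToArray();
    }

    // Builds a serializer for baseType that also knows the derived types found above.
    public static XmlSerializer CreateSerializer(Type baseType, Assembly assembly)
    {
        return new XmlSerializer(baseType, FindDerivedTypes(baseType, assembly));
    }
}
```

A call could look like DerivedTypeFinder.CreateSerializer(typeof(BaseT), typeof(BaseT).Assembly). Serializers created through this constructor overload are not cached by the framework, so it is probably worth keeping the instance around and reusing it rather than constructing a new one per call.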

Robert S.