
I have to load an XML file and deserialize it into an object. I can read the XML, advance to the element that describes the object, and parse only that part, which is great, but there is a namespace declared on the root element of the XML.

I don't understand why, but even though I read the XML starting from a given node, the xmlns attribute gets added to it. As a result, my program cannot deserialize it into an object because of the unexpected member.

My code:

public static SomeClass GetObjectFromXml (string path)
    {
        XmlReader reader = XmlReader.Create(path);
        string wantedNodeContents = string.Empty;
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element && reader.Name == "IWantThis")
            {
                wantedNodeContents = reader.ReadOuterXml();
                break;
            }
        }
        XmlSerializer xmlSerializer = new XmlSerializer(typeof(SomeClass));
        System.IO.StringReader stringReader = new System.IO.StringReader(wantedNodeContents);
        SomeClass loadedSomeClassXml = xmlSerializer.Deserialize(stringReader) as SomeClass;
        return loadedSomeClassXml;
    }

How could I get rid of the xmlns and deserialize the xml into an object?

66Gramms
  • It's added to the outer XML because some containing node has a [default namespace](https://www.w3.org/TR/xml-names/#defaulting) which, being a default, applies to your `<IWantThis>` node. Since `ReadOuterXml()` is designed to not change the semantics of the XML being read, it has to add in a default namespace to the XML returned. – dbc Nov 04 '20 at 18:12
  • I see, so what approach could I take? – 66Gramms Nov 04 '20 at 19:05

2 Answers


You have a few issues here:

  1. The default namespace attribute is added to the string returned by ReadOuterXml() because ReadOuterXml() is designed not to change the semantics of the returned XML. Apparently in your XML there is a default namespace applied to some parent node of <IWantThis> -- which, being a default namespace, recursively applies to <IWantThis> itself. To retain this namespace membership, ReadOuterXml() must emit a default namespace as it writes out the nested XML.

    If you really want to completely ignore namespaces in the XML, you need to create a custom XmlReader, e.g. the NamespaceIgnorantXmlTextReader class included in the code below.

  2. You need to construct an XmlSerializer for SomeClass whose expected root node is <IWantThis>. You can do this using the XmlSerializer(Type, XmlRootAttribute) constructor. However, if you do, you must statically cache and reuse the serializer to avoid a severe memory leak, as explained in Memory Leak using StreamReader and XmlSerializer.

  3. You are creating a local copy, wantedNodeContents, of the element you want to deserialize, then re-parsing that local copy. There is no need to do this; you can use XmlReader.ReadSubtree() to deserialize just a portion of the XML.
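
To make the three points above concrete, here is a minimal, self-contained sketch. The sample document, the `urn:example` namespace, and the `Item` type with its `Name` member are made up for illustration; only the `IWantThis` element name comes from the question. `OuterXml()` returns the string produced by `ReadOuterXml()`, which carries the inherited default namespace as an `xmlns` attribute, and `Deserialize()` reads the same element through `ReadSubtree()` with a serializer whose root is declared via `XmlRootAttribute`.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical payload type; the real SomeClass is not shown in the question.
public class Item
{
    public string Name { get; set; }
}

public static class NamespaceDemo
{
    // Made-up sample: the default namespace is declared on the root element only.
    public const string Xml =
        "<Root xmlns=\"urn:example\"><IWantThis><Name>abc</Name></IWantThis></Root>";

    // Built once and reused, since serializers created with an
    // XmlRootAttribute are not cached by the framework.
    static readonly XmlSerializer Serializer = new XmlSerializer(
        typeof(Item),
        new XmlRootAttribute { ElementName = "IWantThis", Namespace = "urn:example" });

    // Advances the reader to the first <IWantThis> element in the sample namespace.
    static bool MoveToWantedElement(XmlReader reader)
    {
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element
                && reader.LocalName == "IWantThis"
                && reader.NamespaceURI == "urn:example")
                return true;
        }
        return false;
    }

    // Point 1: the returned string includes the namespace inherited from <Root>.
    public static string OuterXml()
    {
        using (var reader = XmlReader.Create(new StringReader(Xml)))
            return MoveToWantedElement(reader) ? reader.ReadOuterXml() : null;
    }

    // Points 2 and 3: deserialize the element in place, without an intermediate string.
    public static Item Deserialize()
    {
        using (var reader = XmlReader.Create(new StringReader(Xml)))
        {
            if (!MoveToWantedElement(reader))
                return null;
            using (var subReader = reader.ReadSubtree())
                return (Item)Serializer.Deserialize(subReader);
        }
    }
}
```

Calling `Console.WriteLine(NamespaceDemo.OuterXml())` prints the element with its `xmlns` attribute, while `NamespaceDemo.Deserialize().Name` yields the value of the `<Name>` element.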

Putting all these issues together, your GetObjectFromXml() could look like:

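using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

public static partial class XmlExtensions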
{
    public static T GetObjectFromXml<T>(string path, string localName, string namespaceURI, bool ignoreNamespaces = false)
    {
        using (var textReader = new StreamReader(path))
            return GetObjectFromXml<T>(textReader, localName, namespaceURI, ignoreNamespaces);
    }
    
    public static T GetObjectFromXml<T>(TextReader textReader, string localName, string namespaceURI, bool ignoreNamespaces = false)
    {
        using (var xmlReader = ignoreNamespaces ? new NamespaceIgnorantXmlTextReader(textReader) : XmlReader.Create(textReader))
            return GetObjectFromXml<T>(xmlReader, localName, namespaceURI);
    }
    
    public static T GetObjectFromXml<T>(XmlReader reader, string localName, string namespaceURI)
    {
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element && reader.LocalName == localName && reader.NamespaceURI == namespaceURI)
            {
                var serializer = XmlSerializerFactory.Create(typeof(T), localName, namespaceURI);
                using (var subReader = reader.ReadSubtree())
                    return (T)serializer.Deserialize(subReader);
            }
        }
        // Or throw an exception?
        return default(T);
    }
}

// This class copied from this answer https://stackoverflow.com/a/873281/3744182
// To https://stackoverflow.com/questions/870293/can-i-make-xmlserializer-ignore-the-namespace-on-deserialization
// By https://stackoverflow.com/users/48082/cheeso
// helper class to ignore namespaces when de-serializing
public class NamespaceIgnorantXmlTextReader : XmlTextReader
{
    public NamespaceIgnorantXmlTextReader(System.IO.TextReader reader): base(reader) { }

    public override string NamespaceURI { get { return ""; } }
}

public static class XmlSerializerFactory
{
    // To avoid a memory leak the serializer must be cached.
    // https://stackoverflow.com/questions/23897145/memory-leak-using-streamreader-and-xmlserializer
    // This factory taken from 
    // https://stackoverflow.com/questions/34128757/wrap-properties-with-cdata-section-xml-serialization-c-sharp/34138648#34138648

    readonly static Dictionary<Tuple<Type, string, string>, XmlSerializer> cache;
    readonly static object padlock;

    static XmlSerializerFactory()
    {
        padlock = new object();
        cache = new Dictionary<Tuple<Type, string, string>, XmlSerializer>();
    }

    public static XmlSerializer Create(Type serializedType, string rootName, string rootNamespace)
    {
        if (serializedType == null)
            throw new ArgumentNullException(nameof(serializedType));
        if (rootName == null && rootNamespace == null)
            return new XmlSerializer(serializedType);
        lock (padlock)
        {
            XmlSerializer serializer;
            var key = Tuple.Create(serializedType, rootName, rootNamespace);
            if (!cache.TryGetValue(key, out serializer))
            {
                cache[key] = serializer = new XmlSerializer(serializedType, new XmlRootAttribute { ElementName = rootName, Namespace = rootNamespace });
            }
            return serializer;
        }
    }
}

Demo fiddle here.

dbc

XDocument gives you a bit more flexibility when deserializing XML. I had a similar problem and resolved it using the following code snippet:

// Requires: using System.Xml; using System.Xml.Linq; using System.Xml.Serialization;
/// Type T must have a default (parameterless) constructor

private T XMLToObject<T>(string pathXML)
{
   // SetLineInfo keeps line information available for error reporting
   LoadOptions loadOpt = LoadOptions.SetLineInfo;
   XDocument xmlDocument = XDocument.Load(pathXML, loadOpt);

   // Use the root element's namespace as the serializer's default namespace
   string namespaceXML = xmlDocument.Root.Name.Namespace.NamespaceName;
   XmlSerializer serializer = new XmlSerializer(typeof(T), defaultNamespace: namespaceXML);

   // Dispose the reader once deserialization has finished
   using (XmlReader xmlReader = xmlDocument.CreateReader())
   {
      return (T)serializer.Deserialize(xmlReader);
   }
}

In addition, XmlSerializer provides a set of events you can use to record any issue or error during the deserialization process:

 XmlSerializer serializer = new XmlSerializer(typeof(T), defaultNamespace: namespaceXML);
 
 serializer.UnknownAttribute += new XmlAttributeEventHandler((sender, args) =>
            {
                // Your code for handling unknown attributes found during deserialization
            });

 serializer.UnknownElement += new XmlElementEventHandler((sender, args) =>
            {
                // Your code for handling unknown elements found during deserialization
            });
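
As an illustration of what such handlers might do, the sketch below collects the names of unrecognized nodes instead of silently ignoring them. The `Person` type, the `UnknownNodeDemo` class, and the `CollectUnknownNodes` helper are hypothetical names introduced for this example only; the events and the `Element`, `Attr`, and `LineNumber` properties belong to `XmlSerializer` and its event argument types.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type used only for this illustration.
public class Person
{
    public string Name { get; set; }
}

public static class UnknownNodeDemo
{
    // Deserializes the XML and returns a description of every element
    // and attribute that the serializer did not recognize.
    public static List<string> CollectUnknownNodes(string xml, out Person person)
    {
        var unknown = new List<string>();
        var serializer = new XmlSerializer(typeof(Person));

        serializer.UnknownElement += (sender, args) =>
            unknown.Add("element " + args.Element.Name + " (line " + args.LineNumber + ")");
        serializer.UnknownAttribute += (sender, args) =>
            unknown.Add("attribute " + args.Attr.Name + " (line " + args.LineNumber + ")");

        using (var reader = new StringReader(xml))
            person = (Person)serializer.Deserialize(reader);
        return unknown;
    }
}
```

Logging the collected entries can help diagnose why a member stays empty after deserialization, for example when a child element arrives under a different name or namespace than the serializer expects.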

  • This seemed to almost work, but now the namespace just gets to the last node for me, again resulting in an exception. – 66Gramms Nov 04 '20 at 13:28