We have a project with "custom objects and attributes" created on a Java server, and we need this data on a C# client.

For example, custom object 'A' has attributes 'B' and 'C'. Both 'B' and 'C' are defined by the customer at runtime. The server sends this to us as XML like:

<A>
    <B> B Data </B>
    <C> C Data </C>
</A>

We created a class implementing IXmlSerializable that reads and writes the server's XML, storing the custom attributes in a dictionary:

public class CustomObject : IXmlSerializable
{
    private Dictionary<String, String> attributes;

    public void ReadXml(XmlReader reader)
    {
        attributes = XDocument.Parse(reader.ReadOuterXml()).Root.Elements()
            .ToDictionary(xElm => xElm.Name.LocalName, xElm => xElm.Value);
    }

    // More Serialization logic for IXmlSerializable is here
}

The project is slow and we want to use the faster DataContract serialization. We tested it on a sample by explicitly hard-coding [DataContract] annotations for our attributes (such as 'B' and 'C'). However, in our use case the attributes are not known at compile time. We can query the server for the list of attributes on type 'A'.

How can we use DataContract for attributes defined at runtime?

Kiti Azura
  • You could try implementing [`IExtensibleDataObject`](https://msdn.microsoft.com/en-us/library/system.runtime.serialization.iextensibledataobject.aspx) on your `CustomObject`, then extracting the XML using the trick in [ExtensionDataObject not marked as serializable](https://stackoverflow.com/questions/32056762). But, have you profiled to determine where the real problem is? – dbc May 26 '16 at 07:56
  • Why do you do `XDocument.Parse(reader.ReadOuterXml())`? That effectively parses the XML twice. If you're having problems with performance, you could replace that with [`ReadSubtree()`](https://msdn.microsoft.com/en-us/library/system.xml.xmlreader.readsubtree.aspx). – dbc May 26 '16 at 08:17
  • Thanks for the feedback @dbc, it seems XDocument is taking around 16 ms for every object and also adds to our memory signature. That is why we want DataContract. I will check the link you have shared. – Kiti Azura May 26 '16 at 08:24
  • @dbc there is more logic in the function. I just reduced it for demo. Will ReadSubtree() provide XElement ? XElement is required for further logic that is part of our project – Kiti Azura May 26 '16 at 08:28
  • Actually `XNode.ReadFrom()` would be better. Answer updated. – dbc May 26 '16 at 09:07
  • Testing your solution @dbc – Kiti Azura May 26 '16 at 10:00

1 Answer

Explicit data contracts that allow for arbitrary, unknown elements are not supported by DataContractSerializer. XmlSerializer supports this via [XmlAnyElementAttribute], but as stated in the answer to "Using [XmlAnyElement]", there is no identical functionality for data contracts.
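For comparison, here is a minimal sketch of what the [XmlAnyElement] approach looks like with XmlSerializer, using the sample XML from the question. The class name AnyElementObject and the property name Unknown are hypothetical, chosen only for illustration. Note that this still goes through XmlSerializer rather than DataContractSerializer, so it is not the speed-up you are asking about.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Serialization;

// Sample payload from the question.
var sampleXml = "<A><B> B Data </B><C> C Data </C></A>";

var serializer = new XmlSerializer(typeof(AnyElementObject));
AnyElementObject obj;
using (var textReader = new StringReader(sampleXml))
{
    obj = (AnyElementObject)serializer.Deserialize(textReader);
}

// Child elements that map to no declared member end up in Unknown.
Dictionary<string, string> attributes =
    obj.Unknown.ToDictionary(e => e.LocalName, e => e.InnerText);

foreach (var pair in attributes)
    Console.WriteLine($"{pair.Key} = '{pair.Value}'");

// Hypothetical class for illustration; [XmlRoot("A")] matches the sample wrapper element.
[XmlRoot("A")]
public class AnyElementObject
{
    // Collects every child element that is not bound to another member.
    [XmlAnyElement]
    public XmlElement[] Unknown { get; set; }
}
```

Each XmlElement is a full DOM node, so this is unlikely to use less memory than the XDocument version.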

Your class could implement IExtensibleDataObject. It is similar to [XmlAnyElement] and is intended for forward-compatible data contracts. Unfortunately, in that case the unknown elements are stored in an opaque ExtensionDataObject with no obvious way to access the values. While it is possible to extract the XML from such an object (see here) it's nonobvious and is unlikely to be more performant than your current code, as it requires re-serializing the ExtensionDataObject inside a wrapper class, then parsing the result.

One note about performance - when you do XDocument.Parse(reader.ReadOuterXml()), the reference source shows you are effectively parsing your XML, then streaming it through an XmlWriter to a StringWriter, then parsing the resulting string a second time. Rather than doing this, you can parse the XML only once by calling XNode.ReadFrom() on the incoming reader, like so:

public class CustomObject : IXmlSerializable
{
    private readonly Dictionary<String, String> attributes = new Dictionary<string, string>();

    public IDictionary<string, string> Attributes { get { return attributes; } }

    #region IXmlSerializable Members

    System.Xml.Schema.XmlSchema IXmlSerializable.GetSchema()
    {
        return null;
    }

    void IXmlSerializable.ReadXml(XmlReader reader)
    {
        var element = XElement.ReadFrom(reader) as XElement;
        if (element != null)
        {
            foreach (var item in element.Elements())
                attributes.Add(item.Name.LocalName, (string)item);
        }
    }

    void IXmlSerializable.WriteXml(XmlWriter writer)
    {
        // Do NOT write the wrapper element when writing.
        foreach (var pair in attributes)
        {
            writer.WriteElementString(pair.Key, pair.Value);
        }
    }

    #endregion
}

This should be more performant than your current class. For instance, in "Web API performance issues with large dynamic XML" the reported improvement for a similar optimization was 40%.
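Whether the difference matters for your payloads is best checked by measuring. Below is a rough sketch that times both approaches on the sample XML from the question. The helper names ParseTwice, ParseOnce and Time, and the iteration count, are hypothetical, and a Stopwatch loop like this gives only a coarse indication rather than a rigorous benchmark.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

// Original approach: ReadOuterXml() produces a string that is then parsed again.
static Dictionary<string, string> ParseTwice(XmlReader reader) =>
    XDocument.Parse(reader.ReadOuterXml()).Root.Elements()
        .ToDictionary(e => e.Name.LocalName, e => e.Value);

// Suggested approach: build the XElement directly from the incoming reader.
static Dictionary<string, string> ParseOnce(XmlReader reader) =>
    ((XElement)XNode.ReadFrom(reader)).Elements()
        .ToDictionary(e => e.Name.LocalName, e => e.Value);

// Hypothetical timing helper: runs the given parse function 'count' times.
static TimeSpan Time(string input, int count,
    Func<XmlReader, Dictionary<string, string>> parse)
{
    var stopwatch = Stopwatch.StartNew();
    for (int i = 0; i < count; i++)
    {
        using (var reader = XmlReader.Create(new StringReader(input)))
        {
            reader.MoveToContent(); // position on <A>, as ReadXml() would see it
            parse(reader);
        }
    }
    return stopwatch.Elapsed;
}

var sampleXml = "<A><B> B Data </B><C> C Data </C></A>";
const int Iterations = 10000; // arbitrary; adjust to your payload size

Console.WriteLine($"Parse twice: {Time(sampleXml, Iterations, ParseTwice).TotalMilliseconds:F1} ms");
Console.WriteLine($"Parse once:  {Time(sampleXml, Iterations, ParseOnce).TotalMilliseconds:F1} ms");
```

Running it against a representative server response instead of the tiny sample should give a more meaningful comparison.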

Update

For the best possible performance when implementing IXmlSerializable, you will need to read content directly from the XmlReader using bespoke code. The following, for instance, reads element names and values into the attributes dictionary:

    void IXmlSerializable.ReadXml(XmlReader reader)
    {
        if (reader.IsEmptyElement)
        {
            reader.Read();
            return;
        }
        reader.Read();
        while (reader.NodeType != XmlNodeType.EndElement)
        {
            switch (reader.NodeType)
            {
                case XmlNodeType.Element:
                    // LocalName matches the XElement-based version, which used Name.LocalName.
                    var key = reader.LocalName;
                    var value = reader.ReadElementContentAsString();
                    attributes.Add(key, value);
                    break;

                default:
                    // Comment, for instance.
                    reader.Read();
                    break;
            }
        }
        // Consume the EndElement
        reader.Read();
    }

See "Proper way to implement IXmlSerializable?" for some general guidelines on manually reading an element hierarchy correctly.
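To try the reader loop in isolation, here is a minimal sketch that runs the same logic as a standalone helper over the sample XML from the question. ReadAttributes is a hypothetical name, and the comment in the input is there only to exercise the default branch. It assumes each child element contains only text; ReadElementContentAsString() throws if a child has nested elements.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper mirroring the ReadXml() loop above.
// Expects the reader to be positioned on the wrapper start element.
static Dictionary<string, string> ReadAttributes(XmlReader reader)
{
    var result = new Dictionary<string, string>();
    if (reader.IsEmptyElement)
    {
        reader.Read();
        return result;
    }
    reader.Read(); // move past the wrapper start tag
    while (reader.NodeType != XmlNodeType.EndElement)
    {
        if (reader.NodeType == XmlNodeType.Element)
        {
            var key = reader.LocalName;
            // Reads the text content and advances past the child's end tag.
            result.Add(key, reader.ReadElementContentAsString());
        }
        else
        {
            reader.Read(); // skip comments, whitespace and similar nodes
        }
    }
    reader.Read(); // consume the wrapper end tag
    return result;
}

var sampleXml = "<A><B> B Data </B><!-- ignored --><C> C Data </C></A>";
Dictionary<string, string> attributes;
using (var xmlReader = XmlReader.Create(new StringReader(sampleXml)))
{
    xmlReader.MoveToContent(); // position on <A>
    attributes = ReadAttributes(xmlReader);
}

foreach (var pair in attributes)
    Console.WriteLine($"{pair.Key} = '{pair.Value}'");
```

Because the values are read as plain strings, no XElement or XDocument instances are allocated along the way.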

dbc
  • Tried this. Your solution addresses a valid performance issue, but it did not make a significant difference, so we still need a better solution. ExtensionDataObject won't help either, as we do need access to the data from the custom attributes. – Kiti Azura May 26 '16 at 14:29
  • @KitiAzura - the most performant way to parse the XML then will be to do it manually. From [Chapter 9 — Improving XML Performance](https://msdn.microsoft.com/en-us/library/ff647804.aspx): *If you want to read the document once, use [XmlReader]. This provides forward-only, read-only, and non-cached access to XML data. This model provides optimized performance and memory conservation.* Also see [Effective Xml Part 1: Choose the right API](https://blogs.msdn.microsoft.com/xmlteam/2011/09/14/effective-xml-part-1-choose-the-right-api/). – dbc May 26 '16 at 16:31
  • I think your above comment looks like the correct answer. At least this is what I wanted to do, before we came across DataContract. – Kiti Azura May 27 '16 at 06:13
  • @KitiAzura - I added a bespoke version of `ReadXml()` that works directly with the `XmlReader`. Beyond this you may need to profile and share more details about your measured performance bottlenecks. – dbc May 28 '16 at 23:50