I'm deserializing a custom, inflexible XML schema into C# objects so that I can traverse and migrate the data within.

A brief example:

  <Source>
    ...
    <Provider>
      <![CDATA[read 1]]>
      <Identifier><![CDATA[read 2]]></Identifier>
      <IdentificationScheme><![CDATA[read 3]]></IdentificationScheme>
    </Provider>
    ...
  </Source>

I'm looking to deserialize the Provider element with the first CDATA value, read 1, as well as its sibling element values, read 2 and read 3.

Running the XML through http://xmltocsharp.azurewebsites.net/ produces the following classes:

[XmlRoot(ElementName = "Provider")]
public class Provider
{
    [XmlElement(ElementName = "Identifier")]
    public string Identifier { get; set; }
    [XmlElement(ElementName = "IdentificationScheme")]
    public string IdentificationScheme { get; set; }
}

[XmlRoot(ElementName = "Source")]
public class Source
{
    [XmlElement(ElementName = "Provider")]
    public Provider Provider { get; set; }
}

But it fails to account for the CDATA value; in fact, I think that when deserializing like this the value would not be reachable.

I think this may also be related to which XML deserializer I use. I was planning on RestSharp's (as it's already a library in the website) or System.Xml.Linq.XDocument, but I'm not sure whether either can handle this scenario.

In my searches I couldn't find an example either, but Stack Overflow did suggest the question "<![CDATA[]]> and <ELEMENT> in a xml element", which has precisely the same schema.

Thanks so much in advance for any help.

EDIT 1: As far as I can tell, [XmlText] is the solution required, as pointed out in Marc Gravell's answer below. However, it does not seem to work with (or is not implemented in) RestSharp's XmlDeserializer; further testing would be needed to confirm that.

Pedro Costa

1 Answer

The CDATA is essentially just escaping syntax and is handled by most readers. What you are looking for is:

[XmlText]
public string WhateverThisIs { get; set; }

on the object that has raw content. By adding that to Provider, WhateverThisIs gets the value of "read 1". The other 2 properties already deserialize correctly as "read 2" and "read 3" without you having to do anything.
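To make that concrete, here is a minimal sketch of the whole round trip using the default System.Xml.Serialization.XmlSerializer. The MixedContentDemo class, its Parse helper and the SampleXml constant are hypothetical names introduced for illustration, and the elided `...` parts of the question's XML are simply left out.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

[XmlRoot(ElementName = "Provider")]
public class Provider
{
    // Receives the raw text content of <Provider> (the first CDATA section).
    [XmlText]
    public string WhateverThisIs { get; set; }

    [XmlElement(ElementName = "Identifier")]
    public string Identifier { get; set; }

    [XmlElement(ElementName = "IdentificationScheme")]
    public string IdentificationScheme { get; set; }
}

[XmlRoot(ElementName = "Source")]
public class Source
{
    [XmlElement(ElementName = "Provider")]
    public Provider Provider { get; set; }
}

// Hypothetical helper class, not part of the original question or answer.
public static class MixedContentDemo
{
    // The question's sample, minus the elided parts.
    public const string SampleXml = @"<Source>
  <Provider>
    <![CDATA[read 1]]>
    <Identifier><![CDATA[read 2]]></Identifier>
    <IdentificationScheme><![CDATA[read 3]]></IdentificationScheme>
  </Provider>
</Source>";

    // Deserializes a <Source> document with the default XmlSerializer.
    public static Source Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(Source));
        using (var reader = new StringReader(xml))
        {
            return (Source)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        Source source = Parse(SampleXml);
        Console.WriteLine(source.Provider.WhateverThisIs);
        Console.WriteLine(source.Provider.Identifier);
        Console.WriteLine(source.Provider.IdentificationScheme);
    }
}
```

Only the [XmlText] property is new compared with the generated classes in the question; the two element properties are unchanged.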

For reference, everything here would behave almost the same without the CDATA (there are some whitespace issues):

<Provider>
  read 1
  <Identifier>read 2</Identifier>
  <IdentificationScheme>read 3</IdentificationScheme>
</Provider>
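As for the whitespace issues: without CDATA, the line breaks and indentation around read 1 are part of the same text node, so the [XmlText] value may come back padded. A small sketch of one way to deal with that, assuming trimming is acceptable for your data; WhitespaceDemo, ReadText and PlainXml are hypothetical names used only for illustration.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

[XmlRoot(ElementName = "Provider")]
public class Provider
{
    // Raw text content; in the non-CDATA form this may include surrounding whitespace.
    [XmlText]
    public string WhateverThisIs { get; set; }

    [XmlElement(ElementName = "Identifier")]
    public string Identifier { get; set; }

    [XmlElement(ElementName = "IdentificationScheme")]
    public string IdentificationScheme { get; set; }
}

// Hypothetical helper class, not part of the original answer.
public static class WhitespaceDemo
{
    public const string PlainXml = @"<Provider>
  read 1
  <Identifier>read 2</Identifier>
  <IdentificationScheme>read 3</IdentificationScheme>
</Provider>";

    // Deserializes <Provider> and returns its text content with whitespace trimmed.
    public static string ReadText(string xml)
    {
        var serializer = new XmlSerializer(typeof(Provider));
        using (var reader = new StringReader(xml))
        {
            var provider = (Provider)serializer.Deserialize(reader);
            // Assumption: leading/trailing whitespace carries no meaning here.
            return provider.WhateverThisIs?.Trim();
        }
    }

    public static void Main()
    {
        Console.WriteLine("[" + ReadText(PlainXml) + "]");
    }
}
```

If the padding is meaningful in your data, skip the Trim() call and keep the raw value.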
Marc Gravell
  • yah it's called mixed content, just found out via deeper searching; however, the solution on https://stackoverflow.com/a/25997011/271433 is an array. I assume that as long as there is only one CDATA your answer will work? – Pedro Costa Apr 07 '18 at 16:39
  • @PedroCosta forget about CDATA - that isn't the problem here (see my edit); but yes, this will work fine as long as you only have one *random piece of content in there*. – Marc Gravell Apr 07 '18 at 16:40
  • ummm, everything but the `Text` property is filled in with RestSharp's Deserializer `xmlDeserializer.Deserialize>(response)`, time to try the default one :D – Pedro Costa Apr 07 '18 at 17:07
  • using default works a treat :D `List results = ((EFG)serializer.Deserialize(reader)).Av;` but did have to be stricter with the root node hence the `EFG` – Pedro Costa Apr 07 '18 at 17:22
  • sorry should clarify, `EFG` is document root `AV` is the node that has a `Source` and `Provider` like the schema in the question, comments were just regarding the Deserialization method that the answer supports. – Pedro Costa Apr 07 '18 at 17:27