
I am having trouble reading a self-closing XML element that is invalid. The XML looks like this:

<a key='value'>
  <b key2='value2'>
    <c importantkey='importantvalue'>
  </b>
</a>

Using .NET's XmlDocument class and XPath, I am unable to retrieve element "c" because it is an invalid tag.

I do not have control over the XML, as it is returned by an API. To be more specific, Tumblr's XML API presents video posts in the above format. As a result, I am unable to retrieve the XML element; I can only retrieve up to the "a" element.

Is there any workaround that allows me to retrieve the 'c' element as an XML node?

kjhughes
Alan
  • That's not even valid XML, you cannot parse invalid XML with an XML parser. Do you know what it means to be self closing and how it's represented? I doubt it actually comes from that API like that, you must have "processed" it in an earlier step. – Jeff Mercado Jan 28 '16 at 02:53
  • self closing tag works like.. right..? – Alan Jan 28 '16 at 02:55

1 Answer


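First of all, there is a difference between invalid and not well-formed: well-formedness concerns XML syntax itself (for example, every start tag has a matching end tag), while validity concerns conformance to a schema or DTD.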

Your "XML" is not well-formed.

To make it well-formed, change

<c importantkey='importantvalue'>

to

<c importantkey='importantvalue'/>

or

<c importantkey='importantvalue'></c>

Until you make either change, the textual data you have is not XML, and you cannot expect any conformant XML processor to help you.
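As a quick illustration of that last point, here is a minimal sketch in Python (chosen only for brevity; the question itself is about .NET) that feeds both forms to a conformant parser, the standard library's `xml.etree.ElementTree`. The helper name `is_well_formed` is hypothetical, and the fixed string is produced by hand here purely for the comparison.

```python
import xml.etree.ElementTree as ET

# The text from the question: <c ...> is opened but never closed.
BROKEN = """<a key='value'>
  <b key2='value2'>
    <c importantkey='importantvalue'>
  </b>
</a>"""

# The same text with the start tag rewritten as an empty-element tag.
FIXED = BROKEN.replace(
    "<c importantkey='importantvalue'>",
    "<c importantkey='importantvalue'/>",
)


def is_well_formed(text: str) -> bool:
    """Return True if a conformant XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(BROKEN))  # False
print(is_well_formed(FIXED))   # True
```

The parser rejects the original text outright rather than returning a partial tree, which is why no XPath query can reach "c" until the text is repaired.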

kjhughes
  • Hi, I understand that the slash or the closing tag makes it well-formed. However, Tumblr's API does not return it that way; it is giving me the unclosed tag instead of the valid ones. – Alan Jan 28 '16 at 03:19
  • Then Tumblr's API is not giving you XML, and you cannot use XML tools to process it. Repair it first by processing it as ***text*** before trying to use any XML processor on it. You might try using [**Tidy**](http://www.html-tidy.org/) to repair it. – kjhughes Jan 28 '16 at 03:27
  • Then it's strange, because I was following the documentation on their site: https://www.tumblr.com/docs/en/api/v1 – Alan Jan 28 '16 at 03:38
  • File a bug report (api@tumblr.com) and either (1) repair their broken response before processing it as XML or (2) wait for them to fix the bug. Just make sure you include a [mcve]. – kjhughes Jan 28 '16 at 03:45
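The "repair it as text first" suggestion from the comments could look roughly like the following. This is a sketch under stated assumptions, not a general fix, and it is written in Python rather than .NET for brevity. It assumes the offending element name (`c` here) is known in advance, that the element is always empty and never appears with a separate end tag, and that its attribute values contain no `<` or `>` characters. The helper name `close_empty_tags` is hypothetical.

```python
import re
import xml.etree.ElementTree as ET


def close_empty_tags(text: str, tag: str) -> str:
    """Rewrite unclosed <tag ...> start tags as empty-element tags <tag .../>.

    Assumption: `tag` is always empty and never has a matching </tag>.
    Tags that already end in "/>" are left untouched.
    """
    pattern = re.compile(r"<(%s)(\s[^<>]*?)?(?<!/)>" % re.escape(tag))
    return pattern.sub(
        lambda m: "<%s%s/>" % (m.group(1), m.group(2) or ""),
        text,
    )


broken = """<a key='value'>
  <b key2='value2'>
    <c importantkey='importantvalue'>
  </b>
</a>"""

# Step 1: repair the response as plain text.
repaired = close_empty_tags(broken, "c")

# Step 2: only now hand it to an XML parser and query it.
root = ET.fromstring(repaired)
c = root.find("./b/c")
print(c.attrib["importantkey"])  # importantvalue
```

The same two-step shape (text-level repair, then parse) should carry over to .NET, for example with `Regex.Replace` before `XmlDocument.LoadXml`. A regular expression is fragile for anything beyond a known, narrow defect like this one, so a dedicated repair tool such as Tidy is the more robust choice when the damage is less predictable.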