
I have an XML file that contains an element without a name (<>), and I am not allowed to change the file. I am using XmlReader and XNode to read the file.

var el = XNode.ReadFrom(reader) as XElement;

But I get an error about the empty tag:

Name cannot begin with the '>' character

Here is an example of the structure of the XML:

<element1>
    <>
        <element2>
        </element2>
    </>
</element1>

How can I handle a node with a missing name without changing the XML file?

doorman
  • I guess you can catch the exception. Then you know you encountered a `<>`, so you can skip over it and continue reading the rest. – Sweeper Oct 15 '17 at 15:51
  • You could read it as a string, replace `<>` and `</>` with a valid opening and closing tag, then parse the string. – Sani Huttunen Oct 15 '17 at 15:53
  • You could first read the file as text and replace every `<>` and every `</>` with a valid opening and closing tag. Then you parse it as valid XML. – oerkelens Oct 15 '17 at 15:54
  • *"I have a xml file that contains an element without a name `<>`"* That's impossible; then you don't have an XML file but a text file with a few angle brackets here and there. Instead of fixing the reading end, fix the producing end of this mess. – Tomalak Oct 15 '17 at 15:55
  • The problem is that the XML file is quite large, 6 GB or more, so I am streaming the XML rather than loading the whole file. Reading the file into memory and replacing the empty tags is not an option. Regarding the angle brackets, they come as an opening and closing pair, as seen in the example. – doorman Oct 15 '17 at 15:57
  • @doorman: You can't XML stream a non-XML file. You will have to read it as text. Read it line by line if you have to and replace the erroneous tags. – Sani Huttunen Oct 15 '17 at 15:59
  • You can stream the file and search-and-replace in the stream. There is no need to pull 6GB into RAM for that. – Tomalak Oct 15 '17 at 16:01
  • @SaniSinghHuttunen thanks for your feedback. Xml streaming works fine until I reach the element containing the empty tag. – doorman Oct 15 '17 at 16:01
  • @doorman: Exactly. Since it's not an XML file you are in trouble trying to XML stream it. Stream it as a text file or read line by line and replace the erroneous tags. – Sani Huttunen Oct 15 '17 at 16:06
  • @SaniSinghHuttunen I understand but I was hoping I could use XElement when streaming the file. – doorman Oct 15 '17 at 16:08
  • @Tomalak thanks for the suggestion. The problem is that when the reader tries to read element1 it fails immediately, because element1 contains the empty tag. Do you know how I can check whether element1 contains an empty tag? – doorman Oct 15 '17 at 16:09
  • Could your consuming code take `IEnumerable<XElement>`? If so, you can construct an XElement on the fly and `yield return` it. – Tom Blodget Oct 15 '17 at 16:12
  • @TomBlodget good idea, but the XmlReader fails as soon as I call ReadFrom; it doesn't matter whether I cast to XElement or IEnumerable. – doorman Oct 15 '17 at 16:15

1 Answer


XmlReader is strict: it throws an error on any non-conformance.

So no, you can't skip malformed XML elements unless you write your own XmlReader.

Cleaning up the input is probably the way to go.
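
One way to clean up the input without loading the file into memory is to wrap the underlying TextReader in a filter that rewrites the nameless tags as they stream past, and pass that filter to `XmlReader.Create`. The following is only a minimal sketch, not something tested against the real 6 GB file. `PlaceholderTagReader`, `PlaceholderTagDemo`, and the placeholder name `unnamed` are hypothetical names made up for illustration, and the sketch assumes the literal sequences `<>` and `</>` appear only as the nameless tags (never inside comments or CDATA sections).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper: streams characters from an inner TextReader and
// rewrites "<>" to "<unnamed>" and "</>" to "</unnamed>" on the fly.
public sealed class PlaceholderTagReader : TextReader
{
    private readonly TextReader _inner;
    private readonly string _name;
    // Characters already decided on but not yet handed to the caller.
    private readonly Queue<char> _pending = new Queue<char>();

    public PlaceholderTagReader(TextReader inner, string name = "unnamed")
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
        _name = name; // assumed not to collide with a real element name
    }

    public override int Peek()
    {
        return _pending.Count > 0 ? _pending.Peek() : _inner.Peek();
    }

    public override int Read()
    {
        if (_pending.Count > 0) return _pending.Dequeue();

        int c = _inner.Read();
        if (c != '<') return c; // ordinary character or end of input (-1)

        // Look one or two characters ahead to detect "<>" or "</>".
        int next = _inner.Read();
        if (next == '>')
        {
            Enqueue(_name + ">");            // "<>" becomes "<unnamed>"
        }
        else if (next == '/')
        {
            int third = _inner.Read();
            if (third == '>')
            {
                Enqueue("/" + _name + ">");  // "</>" becomes "</unnamed>"
            }
            else
            {
                // A normal closing tag: pass the consumed characters through.
                _pending.Enqueue('/');
                if (third != -1) _pending.Enqueue((char)third);
            }
        }
        else if (next != -1)
        {
            _pending.Enqueue((char)next);    // a normal opening tag
        }
        return '<';
    }

    private void Enqueue(string s)
    {
        foreach (char ch in s) _pending.Enqueue(ch);
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing) _inner.Dispose();
        base.Dispose(disposing);
    }
}

public static class PlaceholderTagDemo
{
    // Reads the first element from the filtered stream as an XElement.
    public static XElement ReadFirstElement(TextReader source)
    {
        using (var reader = XmlReader.Create(new PlaceholderTagReader(source)))
        {
            reader.MoveToContent();
            return (XElement)XNode.ReadFrom(reader);
        }
    }
}
```

For a real file, the source would be something like `new StreamReader(path)`, and the existing `XNode.ReadFrom(reader)` loop could stay as it is. The nameless elements would then show up as `unnamed` elements. This version only overrides the single-character `Read()`, which keeps the logic short; overriding the buffered `Read(char[], int, int)` as well would likely be faster on a file that large.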

Allanckw
  • Here's an [example](https://blogs.msdn.microsoft.com/jmstall/2005/08/09/implementing-your-own-xmlreader-the-easy-way/) of a custom XmlTextReader – Tom Blodget Oct 15 '17 at 19:18