
The XmlReader seems to be sensitive to whitespace around empty elements.

If I have an empty element with no spaces (<B />) then the reader doesn't see it as an element.

public static void Main()
{
    WriteLine("No spaces around <B>.");
    using (var stringReader = new StringReader(@"<Index><A>a</A><B /><C>c</C></Index>"))
    using (var reader = XmlReader.Create(stringReader))
    {
        while (reader.Read())
        {
            if (reader.Name != "Index" && reader.NodeType == XmlNodeType.Element)
            {
                WriteLine("{0}: {1}", reader.Name, reader.ReadElementContentAsString());
            }
        }
    }
    WriteLine();
    WriteLine("Spaces added around <B>.");
    using (var stringReader = new StringReader(@"<Index><A>a</A> <B /> <C>c</C></Index>"))
    using (var reader = XmlReader.Create(stringReader))
    {
        while (reader.Read())
        {
            if (reader.Name != "Index" && reader.NodeType == XmlNodeType.Element)
            {
                WriteLine("{0}: {1}", reader.Name, reader.ReadElementContentAsString());
            }
        }
    }

    Read();
}

Printing out the NodeType values, it looks like the reader does see it. Here I'm printing out the node types found, in order (without the if statement above):

No spaces around <B>.      Spaces added around <B>.
     Index: Element         Index: Element
         A: Element             A: Element
          : Text                 : Text
         A: EndElement          A: EndElement
         B: Element              : Whitespace
         C: Element             B: Element
          : Text                 : Whitespace
         C: EndElement          C: Element
     Index: EndElement           : Text
                                C: EndElement
                            Index: EndElement

The problem seems to be with the statement:

reader.ReadElementContentAsString()

If I remove that statement then I get B appearing again. I thought that it might be something to do with that method moving the reader to the next node (?) but I can't seem to prove that, or work around it.

How should I handle empty nodes with the XmlReader?

BanksySan
  • Looks like a duplicate of [XmlReader - problem reading xml file with no newlines](https://stackoverflow.com/q/7196468/3744182) and [Why is my XML reader reading every other element?](https://stackoverflow.com/q/10038193/3744182). – dbc Oct 01 '17 at 23:07
  • And also [c# XMLReader skips nodes after using ReadElementContentAs](https://stackoverflow.com/a/24991311/3744182). – dbc Oct 01 '17 at 23:10

1 Answer


In your code, ReadElementContentAsString advances the reader past the end of </A>, so that it is positioned on the empty element <B /> in the version without spaces and on the whitespace node in the version with spaces. Control then proceeds to the reader.Read() call in the head of the while loop, which advances the reader past that node, effectively skipping it. So you need to modify the logic so that reader.Read() is not called in the head of your loop if ReadElementContentAsString was called in the body.
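One way to restructure the loop is sketched below. It is only an illustration of the idea, not the only possible fix: the class name EmptyElementDemo and the helper ReadChildren are hypothetical names introduced here, and the root element name "Index" is taken from the question. The loop calls reader.Read() only when the current node was not consumed by ReadElementContentAsString.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class EmptyElementDemo
{
    // Hypothetical helper: collects (name, content) pairs for every element
    // other than the root "Index" element used in the question.
    public static List<KeyValuePair<string, string>> ReadChildren(string xml)
    {
        var result = new List<KeyValuePair<string, string>>();
        using (var stringReader = new StringReader(xml))
        using (var reader = XmlReader.Create(stringReader))
        {
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name != "Index")
                {
                    var name = reader.Name;
                    // ReadElementContentAsString leaves the reader on the node
                    // that follows the element, so Read() must not be called here.
                    var content = reader.ReadElementContentAsString();
                    result.Add(new KeyValuePair<string, string>(name, content));
                }
                else
                {
                    // Any other node (root element, whitespace, end tags, or the
                    // initial state before the first node): just move forward.
                    reader.Read();
                }
            }
        }
        return result;
    }

    public static void Main()
    {
        foreach (var pair in ReadChildren("<Index><A>a</A><B /><C>c</C></Index>"))
        {
            Console.WriteLine("{0}: {1}", pair.Key, pair.Value);
        }
    }
}
```

With this structure, both inputs from the question (with and without spaces around <B />) should yield A, B and C, with B having empty content.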

cynic