I'm trying to create a simple app that reads XML in a SAX-like, forward-only fashion (XmlTextReader) from a stream that contains not only the XML but also other data, such as binary blobs and text. The stream is simply chunk-based.

When entering my reading function, the stream is properly positioned at the beginning of the XML. I've reduced the issue to the following code example:

string xml = "<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"yes\"?><Models />" + (char)0x014;

XmlTextReader reader = new XmlTextReader(new StringReader(xml));
reader.MoveToContent();
reader.ReadStartElement("Models");

These few lines throw an exception in ReadStartElement because of the 0x014 character at the end of the string.

Interestingly, the code runs just fine with the following input instead:

string xml = "<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"yes\"?><Models></Models>" + (char)0x014;

I don't want to read the whole document because of its size, nor do I want to change the input, since I need to stay backward compatible with older data.

The only solution I can think of so far is a custom stream reader that stops reading after the last end tag, but that would involve considerable parsing effort.

Do you have any ideas on how to solve this issue? I've already tried LINQ to XML's XDocument, but that failed as well.

Thank you very much in advance, Cheers,

Romout

  • BTW, you should not use `new XmlTextReader()`. It has been discouraged since .NET 2.0. Use `XmlReader.Create()` instead. – John Saunders Jan 28 '11 at 20:07
  • Also, is there any thought to giving you sensible data? Mixing binary and text (XML) doesn't make much sense, unless you can identify the XML blocks based on a header or something. – John Saunders Jan 28 '11 at 20:08
  • We have finally moved on to using additional header information to extract the XML data from the stream before parsing it. That approach works well and is surprisingly backward compatible. – Romout Mar 03 '11 at 14:03

1 Answer

I don't know if this is quite what you are looking for, but if you instead call:

reader.IsStartElement("Models");

then the <Models/> node will only be tested for whether it is a start tag or an empty-element tag and whether its Name matches. The reader will not be moved beyond it (the Read() method will not be called), so the character after the element is never parsed.
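To make the idea concrete, here is a minimal sketch built on the input from the question. The class and method names (ModelsProbe, StartsWithModels) are made up for illustration, and it uses XmlReader.Create rather than the XmlTextReader constructor; I am assuming that is acceptable in your code base.

```csharp
using System;
using System.IO;
using System.Xml;

static class ModelsProbe
{
    // Hypothetical helper: reports whether the first content node is
    // <Models> or <Models />, without reading past that node.
    public static bool StartsWithModels(TextReader input)
    {
        XmlReader reader = XmlReader.Create(input);

        // IsStartElement skips to the first content node (it calls
        // MoveToContent internally) and tests it; it does not call Read().
        return reader.IsStartElement("Models");
    }

    static void Main()
    {
        // Same input as in the question: an empty element followed by 0x14.
        string xml = "<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"yes\"?><Models />"
                     + (char)0x14;

        Console.WriteLine(StartsWithModels(new StringReader(xml)));
    }
}
```

If you need to go further, reader.IsEmptyElement tells you whether the element was written as <Models />. In that case there is nothing more to read, so you can stop without calling ReadStartElement.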

Edwin de Koning