
I have been given an XML file and an XSD file. I am trying to validate the XML against the XSD and then, using serialization, load the XML into an object.

I have the validation working as expected, but when I call DeserializeDocToObj I get the following error:

There was an error deserializing the object of type 
Aaa.Bbb.Common.DataTypes.SurveyGroup. Processing instructions
(other than the XML declaration) and DTDs are not supported. 
Line 1, position 2.

I have no idea what this means, and nothing I have read so far has really helped.

The header in the XSD:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" 
           xmlns="http://www.mydomain.co.uk/srm/mscc" 
           targetNamespace="http://www.mydomain.co.uk/srm/mscc" 
           elementFormDefault="qualified" 
           attributeFormDefault="unqualified">
<xs:element name="SurveyGroup">

The header in the XML:

<?xml version="1.0" encoding="utf-8" ?>
<?xml-stylesheet type="text/xsl" href="mscc4_cctv.xsl"?>
<SurveyGroup xmlns="http://www.mydomain.co.uk/srm/mscc" 
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
             xsi:schemaLocation="
              http://www.mydomain.co.uk/srm/mscc 
              http://www.mydomain.co.uk/srm/schemas/mscc4_cctv.xsd">
<Survey>

Deserialization Code:

    public T DeserializeDocToObj(string fileLocation)
    {
        T returnObj;

        using (FileStream reader = new FileStream(fileLocation, FileMode.Open, FileAccess.Read))
        {
            DataContractSerializer ser = new DataContractSerializer(typeof(T));
            returnObj = (T)ser.ReadObject(reader);
        }

        return returnObj;
    }

Any help would be greatly appreciated.

C. M. Sperberg-McQueen
Fred
  • Any chance your file contains the dreaded BOM (byte order mark)? – Marvin Smit Dec 04 '13 at 09:06
  • @MarvinSmit Sorry I am still very much a beginner at this. How would I know? Is there a quick way for me look for it? Would XMLSpy tell me? – Fred Dec 04 '13 at 09:14
  • XMLSpy won't tell you. Use a DOS prompt and run "type filename". If you see weird characters before the XML starts, your file on disk contains the BOM. You'll have to skip that BOM before trying to deserialize, i.e. do a reader.Seek(3, SeekOrigin.Begin); before you call the deserializer (a UTF-8 BOM is 3 bytes long). – Marvin Smit Dec 04 '13 at 12:47
  • Have a look at http://stackoverflow.com/questions/3255993/how-do-i-remove-i-from-the-beginning-of-a-file for more details on that BOM thingy ; – Marvin Smit Dec 04 '13 at 12:49
  • @MarvinSmit thanks for the responses. It turned out there was no BOM and I didn't find a solution to that problem really. However I changed the serialiser from `DataContractSerializer` to `XmlSerializer` and now have what I need (or thereabouts) – Fred Dec 04 '13 at 14:38
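
For readers who want to rule out a BOM from code rather than from a command prompt, here is a minimal sketch. It only checks for the UTF-8 BOM (bytes EF BB BF); the class and method names (`BomCheck`, `HasUtf8Bom`) are made up for illustration and are not part of any library.

```csharp
using System.IO;

public static class BomCheck
{
    // Returns true when the file starts with the three-byte UTF-8 BOM (EF BB BF).
    // Hypothetical helper for illustration; other encodings' BOMs are not checked.
    public static bool HasUtf8Bom(string path)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            var head = new byte[3];
            int read = stream.Read(head, 0, head.Length);
            return read == 3
                && head[0] == 0xEF
                && head[1] == 0xBB
                && head[2] == 0xBF;
        }
    }
}
```

Calling `BomCheck.HasUtf8Bom(fileLocation)` before deserializing would tell you whether the BOM theory applies to your file; in this question it did not.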

2 Answers


Create an XmlReader with the correct XmlReaderSettings and call DataContractSerializer.ReadObject(XmlReader) instead of DataContractSerializer.ReadObject(Stream):

using (var reader = XmlReader.Create(fileName, new XmlReaderSettings { IgnoreProcessingInstructions = true }))
{
    var serializer = new DataContractSerializer(typeof(T));
    return (T)serializer.ReadObject(reader);
}

The XmlReader used by DataContractSerializer.ReadObject(Stream) does not ignore processing instructions. ReadObject(Stream) calls XmlDictionaryReader.CreateTextReader (see the source), which creates an XmlUTF8TextReader (see the source), and that reader does not accept XmlReaderSettings.

Apparently the default behaviour of that reader is to fail on any processing instruction other than the XML declaration, and the string <?xml-stylesheet type="text/xsl" href="mscc4_cctv.xsl"?> is a processing instruction, as C. M. Sperberg-McQueen states.
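
To make this easier to try in isolation, here is a small self-contained sketch. The `SurveyGroupSketch` class and its `Title` member are stand-ins I invented, since the real Aaa.Bbb.Common.DataTypes.SurveyGroup type is not shown in the question; `PiTolerantLoader` is likewise a hypothetical name. Only the data contract name and namespace are taken from the XML header in the question.

```csharp
using System.IO;
using System.Runtime.Serialization;
using System.Xml;

// Stand-in for the real SurveyGroup type, which the question does not show.
[DataContract(Name = "SurveyGroup", Namespace = "http://www.mydomain.co.uk/srm/mscc")]
public class SurveyGroupSketch
{
    [DataMember(Name = "Title")]
    public string Title { get; set; }
}

public static class PiTolerantLoader
{
    // Deserializes T through an XmlReader configured to skip processing
    // instructions such as <?xml-stylesheet ...?>.
    public static T Load<T>(TextReader input)
    {
        var settings = new XmlReaderSettings { IgnoreProcessingInstructions = true };
        using (var reader = XmlReader.Create(input, settings))
        {
            var serializer = new DataContractSerializer(typeof(T));
            return (T)serializer.ReadObject(reader);
        }
    }
}
```

For a file on disk, `PiTolerantLoader.Load<SurveyGroupSketch>(new StreamReader(fileLocation))` would be the equivalent call, assuming your own type's data contract matches the document.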

Kasper van den Berg

The string <?xml-stylesheet type="text/xsl" href="mscc4_cctv.xsl"?> is a processing instruction. Your software is telling you it cannot handle processing instructions in its input. This means that your software appears not to be an XML parser; you need either to restrict your input to the subset of XML it can handle, or get a real parser.
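
If you prefer the "restrict your input" route, one possible sketch is to strip processing instructions with LINQ to XML before handing the document to the serializer. The `PiStripper` class below is a hypothetical helper, not an existing API, and it loads the whole document into memory, so treat it as an illustration rather than a recommendation for large files.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class PiStripper
{
    // Returns the document's markup with every processing instruction removed.
    // XDocument.ToString() also omits the XML declaration.
    public static string StripProcessingInstructions(string xml)
    {
        XDocument doc = XDocument.Parse(xml);

        // Materialize the list first so nodes are not removed while enumerating.
        doc.DescendantNodes()
           .OfType<XProcessingInstruction>()
           .ToList()
           .ForEach(pi => pi.Remove());

        return doc.ToString(SaveOptions.DisableFormatting);
    }
}
```

The cleaned text can then be encoded as UTF-8 and passed to the serializer in place of the original file content.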

C. M. Sperberg-McQueen
  • Can someone cite the part of the standard that defines "if the software cannot handle processing instructions it is not an XML parser" (https://www.w3.org/TR/REC-xml/#sec-pi doesn't make any requirements about handling PIs). – Kasper van den Berg Jun 27 '18 at 08:46
  • I used the term "XML parser" in the ordinary-language sense of "a parser that parses XML documents", not as a term involving conformance. That XML documents can contain processing instructions follows from section 2.1 of the spec and productions [1], [27], and [16] (and others). I have no idea why I mentioned parsers, though, when OP's problem is clearly serialization. – C. M. Sperberg-McQueen Dec 24 '18 at 00:14