I am writing an XML validator that checks a document against an XSD.

Below is what I have so far, but when the validator reaches the line while (list.Read()), it throws this error:

There is no Unicode byte order mark. Cannot switch to Unicode.

Can anybody help me fix it?

public class Validator
{
    public void Validate(string xmlString)
    {
        Boolean bRet = true;
        string xmlPath = @"C:\x.xml";
        string xsdPath = @"C:\general.xsd";

        XmlReaderSettings Settings = new XmlReaderSettings();
        Settings.Schemas.Add("", xsdPath);
        Settings.ValidationType = ValidationType.Schema;
        Settings.ValidationEventHandler +=
            new ValidationEventHandler(SettingsValidationEventHandler);

        XmlReader list = XmlReader.Create(xmlPath, Settings);
        //StringBuilder output = new StringBuilder();
        while (list.Read())
        {
        }
        //File.WriteAllText(@"D:\Output.xml", output.ToString());
    }

    static void SettingsValidationEventHandler(object sender,
                                               ValidationEventArgs e)
    {
        if (e.Severity == XmlSeverityType.Warning)
        {
            MessageBox.Show("WARNING: ");
            MessageBox.Show(e.Message);
        }
        else if (e.Severity == XmlSeverityType.Error)
        {
            MessageBox.Show("ERROR: ");
            MessageBox.Show(e.Message);
        }
    }
}

XML

<?xml version="1.0" encoding="utf-16"?>
<FlashList xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xmlns:xsd="http://www.w3.org/2001/XMLSchema" vin="xxxxxxxxxxxxx">
  <flash ECUtype="xxx" />
</FlashList>

XSD

<?xml version="1.0" encoding="utf-16"?>
<xs:schema attributeFormDefault="unqualified" 
           elementFormDefault="qualified"
           xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="FlashList">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="flash" maxOccurs="unbounded" minOccurs="0">
          <xs:complexType>
            <xs:simpleContent>
              <xs:extension base="xs:string">
                <xs:attribute type="xs:string" name="ECUtype" use="optional"/>
              </xs:extension>
            </xs:simpleContent>
          </xs:complexType>
        </xs:element>
        <xs:element name="Error" maxOccurs="unbounded" minOccurs="0">
          <xs:complexType>
            <xs:simpleContent>
              <xs:extension base="xs:string">
                <xs:attribute type="xs:byte" name="code" use="optional" />
              </xs:extension>
            </xs:simpleContent>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute type="xs:string" name="vin"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
kjhughes
user3122648
  • 3
    Are you sure the "physical" file x.xml is properly encoded? Open it with a text editor such as Sublime or jEdit, to check the actual encoding. – potame Apr 28 '15 at 10:23
  • Yes. I generated this XML file on the server side using the C# class generated from the same XSD file, and it is well formed. This code runs on the client side, where I just want to validate the received XML file against the same XSD. – user3122648 Apr 28 '15 at 10:38

4 Answers

98

The reality of your file's encoding appears to conflict with that specified by your XML declaration. If your file actually uses one-byte characters, declaring encoding="utf-16" won't change it to use two-byte characters, for example.

Try removing the conflicting encoding from the XML declaration. Replace

<?xml version="1.0" encoding="utf-16"?>

with

<?xml version="1.0"?>

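You may also be able to work around the problem by reading the file into a string and loading it with XmlDocument.LoadXml(). A string is already decoded text, so the parser does not use the declared encoding to interpret bytes.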
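
The same idea can be applied to the validating reader from the question. The following is a minimal sketch, not a definitive implementation: it assumes the files are really UTF-8 (or carry a byte order mark), and the names `EncodingSafeValidator` and `Validate` are hypothetical. Both the XML and the XSD are opened through a `StreamReader`, so `XmlReader` receives characters rather than bytes and does not try to switch to the declared `utf-16`.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Schema;

public static class EncodingSafeValidator
{
    // Returns the validation messages; an empty list means no problems were reported.
    public static List<string> Validate(string xmlPath, string xsdPath)
    {
        var messages = new List<string>();
        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };

        // Assumption: the XSD has the same misleading declaration, so it is
        // also read as text. A BOM, if present, overrides the UTF-8 fallback.
        using (var xsdText = new StreamReader(xsdPath, Encoding.UTF8, true))
        using (var xsdReader = XmlReader.Create(xsdText))
        {
            // null means "use the targetNamespace declared in the schema itself".
            settings.Schemas.Add(null, xsdReader);
        }

        settings.ValidationEventHandler +=
            (sender, e) => messages.Add(e.Severity + ": " + e.Message);

        // Reading from a TextReader hands the parser decoded characters,
        // so the encoding in the XML declaration is not applied to the bytes.
        using (var xmlText = new StreamReader(xmlPath, Encoding.UTF8, true))
        using (var reader = XmlReader.Create(xmlText, settings))
        {
            while (reader.Read())
            {
                // Reading to the end drives validation; handlers collect messages.
            }
        }

        return messages;
    }
}
```

Calling `EncodingSafeValidator.Validate(@"C:\x.xml", @"C:\general.xsd")` would then return the collected warnings and errors instead of showing message boxes. Note that this only hides the mismatch on the reading side; the file on disk still declares an encoding it does not use.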

kjhughes
  • 4
    FWIW: `<?xml version="1.0" encoding="utf-8"?>` might do the trick too. – LosManos May 18 '16 at 14:32
  • 1
    Yes, because `utf-8` is the default encoding. – kjhughes May 18 '16 at 14:50
  • 12
    After encountering a similar error, this answer helped me solve my own problem. In my case, I was first creating the XML programmatically, then reading and writing to it at a later point. If you want to remove or change the encoding in the XML declaration using `XmlWriter`, use `writer.WriteProcessingInstruction("xml", "version='1.0'");` (with `writer` being an instance of `XmlWriter`). See [msdn doc](https://msdn.microsoft.com/en-us/library/system.xml.xmlwriter.writeprocessinginstruction(v=vs.110).aspx) – Alexis Le Compte Oct 12 '16 at 09:23
  • 1
    The workaround "You may also be able to load the file into a string as a work-around using LoadXML()." worked for me. – David Smith Mar 29 '19 at 14:03
  • But the question is whether the workaround is safe to use. – Jakub G Jul 01 '21 at 10:02
3

If you cannot change the XML declaration to

<?xml version="1.0"?>

you can instead read the file's content as text and pass that to the reader, rather than giving the reader the file path:

XmlReader.Create(new StringReader(File.ReadAllText(fileName)));

If you use XmlDocument:

var xmlDoc = new XmlDocument();
xmlDoc.LoadXml(File.ReadAllText(filePath));
lucky
  • 2
    Do not use `File.ReadAllText`. Always create a `StreamReader` and `FileStream`. Never allocate file-sized chunks in memory. – Mr. TA Jun 28 '20 at 19:31
  • 2
    @Mr.TA If it is a known, small file, like settings or whatever, File.ReadAllText is perfectly OK. – A.R. May 13 '21 at 02:14
3

This error is thrown when the XML declaration specifies UTF-16 but the file is not actually saved in that encoding.

You can check this with Windows Notepad: choose Save As and look at the encoding shown at the bottom of the dialog (it is probably still UTF-8 rather than UTF-16).

Screenshot of notepad encoding setting
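
The same check can be done in code by looking at the first bytes of the file. This is a rough sketch under stated assumptions: `BomInspector` and `DetectBom` are hypothetical names, and only the UTF-8 and UTF-16 byte order marks are recognised. A result of "none" for a file that declares `utf-16` would be consistent with the error in the question.

```csharp
using System;
using System.IO;

public static class BomInspector
{
    // Returns a label for the byte order mark at the start of the data, or "none".
    // Note: a UTF-32 LE BOM (FF FE 00 00) would also be reported as "UTF-16 LE".
    public static string DetectBom(byte[] head)
    {
        if (head.Length >= 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF)
            return "UTF-8";
        if (head.Length >= 2 && head[0] == 0xFF && head[1] == 0xFE)
            return "UTF-16 LE";
        if (head.Length >= 2 && head[0] == 0xFE && head[1] == 0xFF)
            return "UTF-16 BE";
        return "none";
    }

    // Convenience overload that inspects the first three bytes of a file.
    public static string DetectBom(string path)
    {
        var head = new byte[3];
        using (var stream = File.OpenRead(path))
        {
            int read = stream.Read(head, 0, head.Length);
            Array.Resize(ref head, read); // the file may be shorter than 3 bytes
        }
        return DetectBom(head);
    }
}
```

For example, `BomInspector.DetectBom(@"C:\x.xml")` could be called before validation to log which encoding the file really starts with.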

Jakub G
0

You can use a StreamReader to set the encoding:

  return (TReport)xmlSerializer.Deserialize(
      new StreamReader(
          new FileStream(filename, FileMode.Open, FileAccess.Read), Encoding.UTF8));

Depending on your application, passing the XML around as a string may not be optimal; consider using a stream instead.

BJury