I have XML with the element: <DESCRIPTION>fault – No reply</DESCRIPTION>

I am transforming the XML from a web service as follows, based on Jon Skeet's code at https://stackoverflow.com/a/427737/197229 (the original XML validates fine):

public sealed class StringWriterUTF8 : StringWriter
{
    public override Encoding Encoding
    {
        get { return Encoding.UTF8; }
    }
}

    WebRequest request = WebRequest.Create(url);
    WebResponse response = request.GetResponse();
    Stream stream = response.GetResponseStream();
    StreamReader streamReader = new StreamReader(stream);
    string xml = streamReader.ReadToEnd();
    logger.Log().Debug(String.Format("Received Xml:\n{0}", xml));
    if(Transform != null)
    {
        using (var stringReader = new StringReader(xml))
        using (var xmlReader = XmlReader.Create(stringReader))
        using (var stringWriter = new StringWriterUTF8())
        using (var xmlTextWriter = XmlWriter.Create(stringWriter, new XmlWriterSettings { Indent = true }))
        {
            Transform.Transform(xmlReader, xmlTextWriter);
            xml = stringWriter.ToString();
            logger.Log().Debug(String.Format("Transformed Xml:\n{0}", xml));
        }
    }

Everything looks great, but the generated XML fails validation when I try to use it, even though it looks fine to the naked eye. If I remove that dash from the DESCRIPTION text, there are no problems.

I don't understand why the original XML is fine and the .NET classes are getting tripped up, but if I try to validate the XML in Notepad++ I get this:

Input is not proper UTF-8, indicate encoding ! Bytes: 0x96 0x20 0x4E 0x6F

How can I resolve this? All I want to do is receive XML and transform it into a new XML file without encoding weirdness!
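For anyone reproducing this, here is a minimal sketch of what the reported bytes mean. It assumes the dash in the DESCRIPTION text is U+2013 (EN DASH); the class name `EnDashBytes` is made up for illustration. UTF-8 encodes that character as three bytes, whereas 0x96 is the single byte Windows-1252 uses for it, and a strict UTF-8 decoder rejects a lone 0x96.

```csharp
using System;
using System.Text;

class EnDashBytes
{
    static void Main()
    {
        // U+2013 EN DASH followed by " No", encoded as UTF-8.
        byte[] utf8 = Encoding.UTF8.GetBytes("\u2013 No");
        Console.WriteLine(BitConverter.ToString(utf8)); // E2-80-93-20-4E-6F

        // The bytes quoted in the Notepad++ error message.
        // 0x96 is the Windows-1252 byte for an en dash.
        byte[] reported = { 0x96, 0x20, 0x4E, 0x6F };

        // throwOnInvalidBytes: true makes the decoder fail instead of
        // silently substituting U+FFFD for bad input.
        var strictUtf8 = new UTF8Encoding(false, true);
        try
        {
            strictUtf8.GetString(reported);
            Console.WriteLine("valid UTF-8");
        }
        catch (DecoderFallbackException)
        {
            // 0x96 is a continuation byte with no lead byte before it.
            Console.WriteLine("not valid UTF-8");
        }
    }
}
```

If the second branch is hit, the file on disk was most likely written with a single-byte ANSI code page while its XML declaration still claims UTF-8. That would point at the step that saves the string rather than at the transform itself, though the code shown here is not enough to confirm it.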

Mr. Boy
  • It appears Notepad++ is using the BOM (byte order mark) to determine the encoding. Microsoft's XML classes do not always generate XML with a BOM. – Kevin Jul 11 '16 at 16:41
  • What are the `xsd` settings for the `description` element? – Veverke Jul 11 '16 at 16:51
  • I would strongly suggest creating an `XmlReader` over the `response.GetResponseStream()`. As for the generated XML, where and how do you write it to the file you open in your editor? – Martin Honnen Jul 11 '16 at 16:51
  • @Veverke `xsd:string` – Mr. Boy Jul 12 '16 at 07:49
  • @MartinHonnen I will try that. I am writing all the XML to a log file so I can copy the relevant bits to a new file, but the error was occurring in my C# code before I thought to do that. I think you may be implying that I could be introducing more errors or confusion in my attempts to save the XML, which I can certainly imagine being the case; it's all a bit awkward. – Mr. Boy Jul 12 '16 at 07:51
  • @Mr.Boy, yes. If you first create a string from your XSLT, later write that string to a file, and a text editor then complains when opening the file, we need to see the code that saves the string to the file. – Martin Honnen Jul 12 '16 at 10:20
  • Did you try creating a sample XML with an XSD that defines one of its fields as a string, then populating the corresponding field with a value such as `aaaa - bbbb` to see if you get the same error? If you do not get an error, then we know the problem is in the encoding with which your original XML was saved or created. – Veverke Jul 13 '16 at 07:04
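
The suggestions in the comments above (read with an `XmlReader` directly over the response stream, and be careful how the result is saved) could be sketched roughly as follows. This is only an illustration under assumptions: `TransformSketch` and `TransformStream` are hypothetical names, and `Transform` in the question is assumed to be an `XslCompiledTransform`. The idea is to never pass through a `string`, so the XML classes handle decoding and encoding at both ends.

```csharp
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Xsl;

static class TransformSketch
{
    // Hypothetical helper: reads XML bytes from `input` and writes the
    // transformed document to `output` as UTF-8 bytes.
    public static void TransformStream(XslCompiledTransform transform, Stream input, Stream output)
    {
        var settings = new XmlWriterSettings
        {
            Indent = true,
            Encoding = new UTF8Encoding(false) // UTF-8 without a BOM
        };

        // XmlReader works out the input encoding from the BOM or the
        // XML declaration instead of assuming one.
        using (XmlReader xmlReader = XmlReader.Create(input))
        // XmlWriter encodes the characters itself, so the declared
        // encoding and the bytes actually written agree.
        using (XmlWriter xmlWriter = XmlWriter.Create(output, settings))
        {
            transform.Transform(xmlReader, xmlWriter);
        }
    }
}
```

It might be called with `response.GetResponseStream()` as the input and `File.Create(path)` as the output. If a string is still needed for logging, it can be decoded from the written bytes with `Encoding.UTF8.GetString`; any later save of that string should then pass an explicit UTF-8 encoding rather than rely on the platform default.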
