XMLTextReader is created but XslCompiledTransform.Transform fails with invalid character

Question

My Code:

using (XmlTextReader inputReader = new XmlTextReader(xml, XmlNodeType.Document, new XmlParserContext(null, null, "en", XmlSpace.Default)))
{
    XsltArgumentList arglist = new XsltArgumentList();
    GetXSLT().Transform(inputReader, arglist, outputStream);
}

The XmlTextReader is created fine. Inside the XML there is an entity reference for a vertical tab (character U+000B).

The line that errors is the call to Transform. It says that there is an invalid XML character (the vertical tab of course).

I've tried using the approach referenced in the following article:
Escape invalid XML characters in C#

My question is: how can I remove or ignore the invalid characters using the .NET framework like the link states?

note: in a way that doesn't involve hard coding a list of entity references to replace (I'm already doing this and it is horrible and I feel bad, and I should)

You can try [ignoring it](http://stackoverflow.com/a/2272525/11683) instead of removing. — GSerg, Feb 02 '15 at 16:43
You are ignoring them while reading, you should also ignore them while writing. — GSerg, Feb 02 '15 at 19:05
i'll mark as Answer if you can post a nice way to use `var validXmlChars = text.Where(ch => XmlConvert.IsXmlChar(ch)).ToArray();` to get the characters removed — Nateous, Feb 02 '15 at 20:03
Is the `XmlDoctor` here any help? https://stackoverflow.com/questions/27925128/removing-invalid-characters-from-xml-file-before-deserialization/27976613#27976613 — dbc, Feb 03 '15 at 01:42
@dbc that is a lot of code (too much) and it appears that it is hard coded in terms of which characters it is looking to replace, I'd rather use a regex. I am looking for a solution that relies on the MS .NET framework to tell which characters it needs to replace. — Nateous, Feb 03 '15 at 14:46
@GSerg ignoring might end up being a better solution, so far it seems to be working. I'm still testing it (has to go through a database, web page displaying, printed materials, etc.) — Nateous, Feb 03 '15 at 14:47
@GSerg I think your solution is best. put in an answer and I'll mark it as the answer. otherwise I'll post my code to close this loop, thanks for your help. — Nateous, Feb 04 '15 at 14:05
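
For reference, the `XmlConvert.IsXmlChar` filter mentioned in the comments can be sketched roughly as below. The class and method names are made up for illustration. Note that it only removes raw invalid characters: an entity reference such as the one in the question is spelled with perfectly valid characters and only becomes a vertical tab once the reader expands it, which is probably why filtering the text did not help here. As far as I know, `IsXmlChar(char)` also rejects lone surrogate halves, so text outside the Basic Multilingual Plane would need `XmlConvert.IsXmlSurrogatePair` as well.

```csharp
using System;
using System.Linq;
using System.Xml;

// Hypothetical helper, not part of the original question or answer.
static class XmlCharFilter
{
    // Keeps only the UTF-16 code units that XmlConvert accepts as XML 1.0 characters.
    public static string StripInvalidXmlChars(string text)
    {
        return new string(text.Where(ch => XmlConvert.IsXmlChar(ch)).ToArray());
    }
}
```

Calling `XmlCharFilter.StripInvalidXmlChars` on a string that contains a raw vertical tab drops that character, while a string containing the text `&#xB;` comes back unchanged.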

score 1 · Accepted Answer · answered Feb 04 '15 at 14:29

Try ignoring invalid XML characters both while reading and writing:

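// CheckCharacters = false lets the reader accept character references such as the vertical tab.
var readerSettings = new XmlReaderSettings() { CheckCharacters = false, ConformanceLevel = ConformanceLevel.Document };

// Create is a static method of XmlReader. If xml is a string of markup rather than a Stream
// or TextReader, wrap it in a StringReader: the string overload of Create expects a URI.
using (var inputReader = XmlReader.Create(xml, readerSettings, new XmlParserContext(null, null, "en", XmlSpace.Default)))
{
    XsltArgumentList arglist = new XsltArgumentList();
    var xslt = GetXSLT();

    // Start from the stylesheet's own output settings and relax character checking on the writer too.
    var writerSettings = xslt.OutputSettings.Clone();
    writerSettings.CheckCharacters = false;

    using (var outputWriter = XmlWriter.Create(outputStream, writerSettings))
    {
        xslt.Transform(inputReader, arglist, outputWriter);
    }
}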
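
To try the same idea outside the original project, here is a minimal self-contained sketch. It assumes the XML arrives as a string and uses a tiny stylesheet invented for the demo in place of `GetXSLT()`; the `Demo` class, the `Stylesheet` constant and the element names are all hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

static class Demo
{
    // Hypothetical stylesheet, standing in for whatever GetXSLT() returns.
    const string Stylesheet = @"<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='xml' omit-xml-declaration='yes'/>
  <xsl:template match='/'><out><xsl:value-of select='/root'/></out></xsl:template>
</xsl:stylesheet>";

    public static string Transform(string xml)
    {
        var xslt = new XslCompiledTransform();
        using (var styleReader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            xslt.Load(styleReader);
        }

        // Relax character checking on the input side...
        var readerSettings = new XmlReaderSettings { CheckCharacters = false };

        // ...and on the output side, starting from the stylesheet's own settings.
        var writerSettings = xslt.OutputSettings.Clone();
        writerSettings.CheckCharacters = false;

        var output = new StringWriter();
        using (var reader = XmlReader.Create(new StringReader(xml), readerSettings))
        using (var writer = XmlWriter.Create(output, writerSettings))
        {
            xslt.Transform(reader, new XsltArgumentList(), writer);
        }
        return output.ToString();
    }
}
```

With both settings relaxed, `Demo.Transform("<root>a&#xB;b</root>")` should complete without the invalid-character exception. The vertical tab is passed through rather than removed, so whatever consumes the output still has to cope with it.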

answered Feb 04 '15 at 14:29

GSerg


Thanks! I'll have to review what `ConformanceLevel = ConformanceLevel.Document` does to see if I need to add that to mine. – Nateous Feb 04 '15 at 16:04
@Nate My understanding was that it does the same as your `XmlNodeType.Document` parameter for the `XmlTextReader`'s constructor. – GSerg Feb 04 '15 at 16:05
