3

A security scan of our C# source reported "Missing XML Validation" as a possible injection flaw. It cited https://cwe.mitre.org/data/definitions/112.html and other sources.

Its recommendation was:

Always enable validation when you parse XML. If enabling validation causes problems because the rules for defining a well-formed document are Byzantine or altogether unknown, chances are good that there are security errors nearby.

Example: The following code demonstrates how to enable validation when using XmlReader.XmlReader

Settings settings = new XmlReaderSettings();
settings.Schemas.Add(schema);
settings.ValidationType = ValidationType.Schema;
StringReader sr = new StringReader(xmlDoc);
XmlReader reader = XmlReader.Create(sr, settings);

I have an XSD schema available for validation. My question is, how do I load the XSD as an XmlSchema without duplicating the error of loading an XML file without validation?

If I read the XSD from the file system, I think I am just duplicating the same error (reading XML without validation). Is there a recommended way to do this?

Our first approach was to read the XSD from the file system, like:

XmlTextReader xsdReader = new XmlTextReader("MySchema.xsd"));
XmlSchema schema = XmlSchema.Read(xsdReader, ValidationCallback);  

But, I believe this causes the same error, reading the XML (in this case the XSD) without validation.

The approach that we are using now (that I think will pass the security scan) is to load the XSD from an embedded resource.

Stream xsdStream = Assembly.GetAssembly(typeof(MyType))
      .GetManifestResourceStream("MyNamespace.MySchema.xsd");  
if (xsdStream == null) throw ...
XmlSchema schema = XmlSchema.Read(xsdStream, ValidationCallback);   

We have not rescanned yet, but I suspect the embedded resource approach will pass. But, is there recommended or best practice approach to this?

Michael Levy
  • 13,097
  • 15
  • 66
  • 100
  • Can you handle the `settings.ValidationEventHandler`? So read the schema in first, setup the XmlReaderSettings next and then use XmlReader to load the document. – Trevor Oct 30 '19 at 17:27
  • Right, but loading the XSD as: XmlTextReader xsdReader = new XmlTextReader("MySchema.xsd")); XmlSchema schema = XmlSchema.Read(xsdReader, ValidationCallback); duplicates the original problem, readming XML without validation. – Michael Levy Oct 30 '19 at 17:34
  • Can you use `XmlSchema schema; using (StringReader xsdReader = new StringReader(xsdPath)) { schema = XmlSchema.Read(xsdReader, null); }` passing `null` into the read. Then your settings.ValidationEventHandler as I suggested would validate this; something like: `settings.ValidationEventHandler += (sender, args) => { if (args.Severity == XmlSeverityType.Error) throw new InvalidOperationException(args.Message); }` – Trevor Oct 30 '19 at 17:46
  • 1
    Hard for us to know exactly what your security vulnerability scan tool will object to. Have you tested whether either of the code fragments from [this answer](https://stackoverflow.com/a/14542621/3744182) to [Adding (Embedded Resource) Schema To XmlReaderSettings Instead Of Filename?](https://stackoverflow.com/q/14185087/3744182) actually cause this scan tool to complain? – dbc Oct 30 '19 at 18:25
  • @dbc, yes, that is basically the solution we've implemented. I was mostly wondering if there are other recommended approaches. (we have not rescanned yet, but I suspect the embedded resource will pass). – Michael Levy Oct 30 '19 at 18:28
  • 1
    Oh, but don't use `XmlTextReader`, it's deprecated according to the [docs](https://learn.microsoft.com/en-us/dotnet/api/system.xml.xmltextreader?view=netframework-4.8). Use `XmlReader.Create()` instead. I vaguely recall `XmlTextReader` has more permissive default settings for resolving external URLs and so might get flagged as being insecure. – dbc Oct 30 '19 at 18:30
  • Your security scan tool explains how to enable validation, but does it give any concrete examples of security problems that can be caused by lack of validation? – dbc Oct 30 '19 at 22:27

1 Answers1

1

Anyone who can write the phrase "If enabling validation causes problems because the rules for defining a well-formed document are Byzantine" is revealing that they know very little about XML; it seems they don't understand the difference between being valid and being well-formed, which is pretty fundamental. So you're having to find ways of getting around rules that aren't very smart. At this point you have to decide whether your objective is to make the system more secure, or to pass the security tests.

It's very hard to see what security vulnerabilities will be fixed by enabling validation.

Especially as you can write a schema that accepts any document as valid, and I bet your security tool will be happily content that you are obeying the rules even though you haven't increased security one iota.

When a schema processor loads a schema then it automatically validates that it is a valid schema. So there really isn't any risk. But whether your security scanner accepts that there isn't any risk is another matter entirely.

Michael Kay
  • 156,231
  • 11
  • 92
  • 164
  • 1
    One thought: `XmlTextReader` (used internally by `XmlSerializer`) was vulnerable to XML bomb attacks prior to .Net 4.0; see [XML bomb (Entity Injection) is by default taken care in .Net 4.0 but not in .Net 3.5. How? What changed?](https://stackoverflow.com/q/23096087/3744182) and [Is .NET XmlSerializer.Deserialize(TextReader) safe?](https://security.stackexchange.com/q/188567/124029). Validation against a schema that included string length restrictions might have protected against this sort of attack in earlier versions. – dbc Oct 30 '19 at 22:19
  • But in general I agree with *It's very hard to see what security vulnerabilities will be fixed by enabling validation.* Validation might prevent other problems however, such as deserialized objects being in some invalid state (e.g. missing required properties) and causing problems and bugs down the road. – dbc Oct 30 '19 at 22:24
  • I think it's highly unlikely that a length restriction in a schema will prevent entity-expansion attacks. The XML parser will almost certainly try to construct the string in its entirety before testing its length against rules in the schema, rather than testing the length continually as the string is built. – Michael Kay Oct 31 '19 at 08:25
  • Excellent points. Also, I'm not reading the xml from some public input. It is from the file system on a secured server. But, I need to get the gold star that says the code "passed security scan", even if the changes to make no practical sense. – Michael Levy Oct 31 '19 at 15:58
  • You have my sympathy. We had a customer reject an invoice containing a bank account number for payment the other day because some security bod decided sending bank account details by email was insecure. So we sent it again with punctuation between the digits. – Michael Kay Oct 31 '19 at 17:06