58

I'm getting this error when trying to parse through an XML document in a C# application:

"For security reasons DTD is prohibited in this XML document. To enable DTD processing set the ProhibitDtd property on XmlReaderSettings to false and pass the settings into XmlReader.Create method."

For reference, the exception occurred at the second line of the following code:

using (XmlReader reader = XmlReader.Create(uri))
{
    reader.MoveToContent(); //here

    while (reader.Read()) //(code to parse xml doc follows).

My knowledge of XML is pretty limited, and I have no idea what DTD processing is or how to do what the error message suggests. Any help as to what may be causing this and how to fix it? Thanks.

ConnorU
  • 2
    Have you taken the steps listed in the error message? If not, why do they not work for you? – user7116 Dec 13 '12 at 20:58
  • I'm a bit surprised you want to parse an XML document without knowing what a DTD is. How would you detect any errors in the input? – U. Windl Aug 29 '23 at 14:53

4 Answers

87

First, some background.

What is a DTD?

The document you are trying to parse contains a document type declaration; if you look at the document, you will find near the beginning a sequence of characters beginning with <!DOCTYPE and ending with the corresponding >. Such a declaration allows an XML processor to validate the document against a set of declarations which specify a set of elements and attributes and constrain what values or contents they can have.

Since entities are also declared in DTDs, a DTD allows a processor to know how to expand references to entities. (The entity pubdate might be defined to contain the publication date of a document, like "15 December 2012", and referred to several times in the document as &pubdate; -- since the actual date is given only once, in the entity declaration, this usage makes it easier to keep the various references to publication date in the document consistent with each other.)
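
To make the entity mechanism concrete, here is a minimal C# sketch. The sample document, the class name `EntityDemo`, and the method `ReadText` are invented for illustration; only the `pubdate` entity and its value come from the paragraph above. It assumes a reader created by `XmlReader.Create` with `DtdProcessing.Parse`, which expands entity references as it reads.

```csharp
using System;
using System.IO;
using System.Xml;

public static class EntityDemo
{
    // Hypothetical sample: an internal DTD subset that declares the entity "pubdate".
    public const string Sample =
        "<!DOCTYPE doc [\n" +
        "  <!ELEMENT doc (#PCDATA)>\n" +
        "  <!ENTITY pubdate \"15 December 2012\">\n" +
        "]>\n" +
        "<doc>Published on &pubdate;.</doc>";

    // Reads the text of the root element with entity references expanded.
    public static string ReadText(string xml)
    {
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Parse, // accept the DOCTYPE and its declarations
            XmlResolver = null                   // assumption: nothing external needs fetching
        };
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            reader.MoveToContent();              // skips the DOCTYPE node
            return reader.ReadElementContentAsString();
        }
    }
}
```

Calling `EntityDemo.ReadText(EntityDemo.Sample)` should return the element text with `&pubdate;` replaced by the declared date. The date is written once, in the declaration, and every reference picks it up.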

What does a DTD mean?

The document type declaration has a purely declarative meaning: a schema for this document type, in the syntax defined in the XML spec, can be found at such and such a location.

Some software written by people with a weak grasp of XML fundamentals suffers from an elementary confusion about the meaning of the declaration; it assumes that the meaning of the document type declaration is not declarative (a schema is over there) but imperative (please validate this document). The parser you are using appears to be such a parser; it assumes that by handing it an XML document that has a document type declaration, you have requested a certain kind of processing. Its authors might benefit from a remedial course on how to accept run-time parameters from the user. (You see how hard it is for some people to understand declarative semantics: even the creators of some XML parsers sometimes fail to understand them and slip into imperative thinking instead. Sigh.)

What are these 'security reasons' they are talking about?

Some security-minded people have decided that DTD processing (validation, or entity expansion without validation) constitutes a security risk. Using entity expansion, it's easy to make a very small XML data stream which expands, when all entities are fully expanded, into a very large document. Search for information on what is called the "billion laughs attack" if you want to read more.

One obvious way to protect against the billion laughs attack is for those who invoke a parser on user-supplied or untrusted data to invoke the parser in an environment which limits the amount of memory or time the parsing process is allowed to consume. Such resource limits have been standard parts of operating systems since the mid-1960s. For reasons that remain obscure to me, however, some security-minded people believe that the correct answer is to run parsers on untrusted input without resource limits, in the apparent belief that this is safe as long as you make it impossible to validate the input against an agreed schema.

This is why your system is telling you that your data has a security issue.

To some people, the idea that DTDs are a security risk sounds more like paranoia than good sense, but I don't believe they are correct. Remember (a) that a healthy paranoia is what security experts need in life, and (b) that anyone really interested in security would insist on the resource limits in any case -- in the presence of resource limits on the parsing process, DTDs are harmless. The banning of DTDs is not paranoia but fetishism.


Now, with that background out of the way ...

How do you fix the problem?

The best solution is to complain bitterly to your vendor that they have been suckered by an old wives' tale about XML security, and tell them that if they care about security they should do a rational security analysis instead of prohibiting DTDs.

Meanwhile, as the message suggests, you can "set the ProhibitDtd property on XmlReaderSettings to false and pass the settings into XmlReader.Create method." If the input is in fact untrusted, you might also look into ways of giving the process appropriate resource limits.
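
A rough sketch of what "pass the settings into XmlReader.Create" looks like in practice. Note two substitutions: it uses `DtdProcessing.Parse`, the non-obsolete counterpart of `ProhibitDtd = false` on newer framework versions, and it reads from a `TextReader` so the example is self-contained. The class and method names are hypothetical, and setting `XmlResolver` to null assumes no external DTD or entity needs to be fetched.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class DtdTolerantReader
{
    // Collects the names of all elements in a document that carries a DOCTYPE.
    public static List<string> ElementNames(TextReader input)
    {
        var settings = new XmlReaderSettings();
        // Counterpart of "ProhibitDtd = false" on newer framework versions.
        settings.DtdProcessing = DtdProcessing.Parse;
        // Assumption: external DTDs and entities are not needed, so resolve nothing.
        settings.XmlResolver = null;

        var names = new List<string>();
        using (XmlReader reader = XmlReader.Create(input, settings))
        {
            reader.MoveToContent(); // with these settings the DOCTYPE is accepted
            do
            {
                if (reader.NodeType == XmlNodeType.Element)
                    names.Add(reader.Name);
            } while (reader.Read());
        }
        return names;
    }
}
```

For the code in the question, the same settings object would be passed as the second argument, as in `XmlReader.Create(uri, settings)`.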

And as a fallback (I do not recommend this) you can comment out the document type declaration in your input.

C. M. Sperberg-McQueen
  • 2
    So tl;dr just do what the error message says? I doubt the OP opening a Connect entry with MS is going to solve their or your problems with DTD processing anytime soon. – user7116 Dec 13 '12 at 20:58
  • 1
    Thanks for the info, helped me understand just why the problem was happening. As far as how to solve it, it's as simple as adding two lines of extra code. – ConnorU Dec 19 '12 at 16:48
  • 2
    Although the resource exhaustion attack is a concern, a more significant concern has now emerged in the External Entity Processing attack [documented here](https://www.owasp.org/index.php/XML_External_Entity_(XXE)_Processing). Effectively, it could allow an attacker to read files from your server, or network. The default setting is still probably the right one! – pattermeister Jan 24 '14 at 10:44
  • Thank you. That's another interesting case of a security analysis which makes no sense to me. The external entities in question are on the local system. The parser is running on the local system. The same-origin policy used in browsers would suffice to prevent access to them, and in any case nothing in the article you point to explains how the attacker gains access to the information. The article seems to assume that the server's parser is delivering the parsed data back to the source of the XML document instead of to some application running on the server; that ain't necessarily so. – C. M. Sperberg-McQueen Jan 27 '14 at 17:47
  • 2
    `System.Xml.XmlReaderSettings.ProhibitDtd is obsolete: Use XmlReaderSettings.DtdProcessing property instead.` See AaronD's answer. – Nicolas Raoul Mar 19 '15 at 06:12
  • As a pentester who focuses on application security, XML entity injection is a threat that is greatly underestimated. I've used it in many cases to access files on a server to grab passwords etc. I would ALWAYS disable DTD resolution. If there was a reason to enable it, I would override the resolver and ensure that any external references are whitelisted. – Casey Sep 22 '16 at 19:55
52

Note that settings.ProhibitDtd is now obsolete; use DtdProcessing instead, which offers the options Ignore, Parse, and Prohibit:

XmlReaderSettings settings = new XmlReaderSettings();
settings.DtdProcessing = DtdProcessing.Parse;

As stated in the post "How does the billion laughs XML DoS attack work?", you should also limit the number of characters that entity expansion may produce, to avoid DoS attacks:

XmlReaderSettings settings = new XmlReaderSettings();
settings.DtdProcessing = DtdProcessing.Parse;
settings.MaxCharactersFromEntities = 1024;
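
As a hedged illustration of what that limit does, the sketch below builds a scaled-down nested-entity document and reads it with the settings above passed into `XmlReader.Create`. The names `EntityLimitDemo`, `BuildNested`, and `TryReadAll` are hypothetical, and the sketch assumes the reader throws an `XmlException` once expanded entity text exceeds `MaxCharactersFromEntities`.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class EntityLimitDemo
{
    // Builds a small document whose nested entities expand to roughly
    // 10^levels characters (a scaled-down "billion laughs").
    public static string BuildNested(int levels)
    {
        var sb = new StringBuilder("<!DOCTYPE doc [\n<!ENTITY e0 \"aaaaaaaaaa\">\n");
        for (int i = 1; i < levels; i++)
        {
            // Each entity refers ten times to the one declared before it.
            sb.Append("<!ENTITY e").Append(i).Append(" \"");
            for (int j = 0; j < 10; j++) sb.Append("&e").Append(i - 1).Append(';');
            sb.Append("\">\n");
        }
        sb.Append("]>\n<doc>&e").Append(levels - 1).Append(";</doc>");
        return sb.ToString();
    }

    // Returns true if the whole document was read, false if the reader rejected it.
    public static bool TryReadAll(string xml, long maxEntityChars)
    {
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Parse,
            MaxCharactersFromEntities = maxEntityChars,
            XmlResolver = null // assumption: no external resources are needed
        };
        try
        {
            using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
            {
                while (reader.Read()) { } // expansion happens as nodes are read
            }
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }
}
```

With a limit of 1024, a one-level document (ten expanded characters) should read normally, while a four-level document (about ten thousand expanded characters) should be rejected. The value 1024 is the one from the snippet above; a real application would pick a limit that fits its legitimate documents.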
Dr. Aaron Dishno
29

As far as fixing this, with a bit of looking around I found it was as simple as adding:

XmlReaderSettings settings = new XmlReaderSettings();
settings.ProhibitDtd = false;

and passing these settings into the XmlReader.Create method.

[UPDATE 3/9/2017]

As some have pointed out, .ProhibitDtd is now deprecated. Dr. Aaron Dishno's answer, below, shows the superseding solution.

ConnorU
  • 29
    As of the latest (4.5.1) .Net framework, `.ProhibitDtd` is now obsolete, and one should use `settings.DtdProcessing = DtdProcessing.Ignore` for the above equivalent. – Karl Cassar Mar 14 '14 at 12:38
  • Would you mind updating your answer, please? `ProhibitDtd` is deprecated nowadays. – NoWar Mar 01 '17 at 12:42
  • 1
    @Dimi, Dr. Aaron Dishno's answer, below mine, explains this and provides the current best way to do this as well as some excellent advice. I do not wish to take credit or rep from him by including his answer here as my own. – ConnorU Mar 09 '17 at 13:08
-3

After trying all of the above answers without success, I changed the service user from service@mydomain.com to service@mydomain.onmicrosoft.com, and now the app works correctly while running in Azure.

Alternatively, if you run into this problem in an environment you have more control over, you can paste the following into your hosts file:

127.0.0.1 msoid.onmicrosoft.com
127.0.0.1 msoid.mydomain.com
127.0.0.1 msoid.mydomain.onmicrosoft.com
127.0.0.1 msoid.*.onmicrosoft.com
aalesund