1

I have re-coded to avoid embedded null characters in C# strings,... but was wondering why the following calls gave no warning or exception for string parameters with an embedded null character, and whether this was a bug in StringBuilder.ToString(), a bad practice in general for C#, or at worst a vulnerability in .NET.

For background I have a WPF application that was parsing through an XPath to create nodes and attributes within an XmlDocument when needed. The StringBuilder class let me replace a path delimiter with a null character, e.g.: xpathtonode[i] = '\0';

Though this is allowed, if it were a bad practice I would hope to receive and exception or at least a warning.

The call to xmlpathtonode.ToString() would correctly return the string up to the null terminating character, except when the null character was embedded as the last character, then the null character would be included in the string returned by ToString(). Thus the string's Length property would be longer than the intended string value.

If StringBuilder.ToString() would recognize the null character at the end of the string and exclude it, there would not have been the following issue. Maybe this is just a bug in the StringBuilder class...

The subsequent call to XmlDocument.CreateAttribute(...), or even a call to exclude the embedded null character xpathtonode.ToString().Substring(offset,length) would exit the thread of execution without error or exception. My program and the debugger would continue to operate as if the call had never occurred,...

I doubt that this would be an OS style buffer overflow vulnerability,... but it is creepy to have the flow of execution interrupted and continue without any indication.

Bug? Bad Practice? Vulnerability?

ergohack
  • 1,268
  • 15
  • 27
  • @TMcKeown, elaborate on why. – Adrian Sanguineti Nov 04 '14 at 00:48
  • When you say `The call to xmlpathtonode.ToString() would correctly return the string up to the null terminating character` what do you mean? It should return the entire thing, including any text after the null character. – Rufus L Nov 04 '14 at 00:58
  • Bad practice. Why would you expect a null character in the middle of a string? A binary stream, perhaps, but not a string. – m_a_s Nov 04 '14 at 01:08
  • @RufusL When the null character was placed in the middle of a string, the call to `ToString()` would only return the characters preceding the null character, not the entire string with the null character in the middle. When the null character was at the end of the string, `ToString()` would include the null character with the preceding characters. – ergohack Nov 04 '14 at 23:17

3 Answers3

1

In your problem statement, you said,

The StringBuilder class let me replace a path delimiter with a null character, e.g.: xpathtonode[i] = '\0';

Though this is allowed, if it were a bad practice I would hope to receive and [sic] exception or at least a warning.

U+0000 (Ascii NUL) is a perfectly legal Unicode control character and a perfectly legal character in a .Net string: .Net strings aren't nul-terminated: they carry a length specifier around with them.

You might use a more appropriate Unicode/ASCII control character for this:

  • U+001C (FS) is File Separator.
  • U+001D (GS) is Group Separator.
  • U+001E (RS) is Record Separator.
  • U+001F (US) is Unit Separator.

Back in the old days (history lesson coming), when men were men, data was persisted to paper tape or punch cards.

In particular, on paper tape, fields within a file record would be separated with US, the unit separator. Groups of fields (e.g., repeating fields or a group of related fields) might be delimited with GS (group separator). Individual records within a file would be delimited with RS (record separator) and individual files on the tape with FS the file separator.

Punch cards were a little different since cards were discrete things. Each record was often (but not always!) on a single punch card. And a "file" might be 1 or more boxes of punch cards.

Nicholas Carey
  • 71,308
  • 16
  • 93
  • 135
  • 1
    You have a good point that control characters should be supported within a string. But if the null character, as with other control characters were fully supported as a character why does the `ToString()` truncate the returned string, except with the null character is at the end? Is the mid string truncation a bug in your view or possibly an archaic nod to null terminated string functionality? What about the function calls that unceremoniously terminate when a string parameter contains a null character? – ergohack Nov 04 '14 at 23:33
0

Bad practice because the \0 character can be interpreted differently by various features/functions of .NET giving you strange/unpredictable results, the real question is why you would purposely use that character.

Here is a similar question/response: Why is there no Char.Empty like String.Empty?

Community
  • 1
  • 1
T McKeown
  • 12,971
  • 1
  • 25
  • 32
  • Like you, the more seasoned C# programmers at my company were astonished and expressed how un-C# it was to null terminate anything. I was puzzled why .NET was so inconsistent and dysfunctional. – ergohack Nov 04 '14 at 23:45
  • To answer your question, I was parsing through an XPath to consume it in stages, e.g. given the full XPath `/site/survey/system[@IP="somehost.net"]/@name` I needed to first create the `site` node in my XmlDocument if it did not already exist, then the `survey` node, etc,... To check for the existence of a node with `SelectSingleNode(...)` I would need the partial path progressively through each node until the entire path existed. The xpath StringBuilder object allowed me to replace the delimiters with a null character progressively and returned just the partial path needed. But at the end,... – ergohack Nov 05 '14 at 00:03
0

Bug? Bad Practice? Vulnerability?

Specific to .NET's XmlDocument object, since you mention that calls to CreateAttribute(...) or xpathtonode.ToString().Substring(offset,length) cause the thread to be exited without error or exception then this appears to be a small bug. It would be bad practise for you to include the null character in any code because of this quirk.

However, this can also be classed as a vulnerability if you are constructing the path from user input, as a malicious user could include the null character on purpose to change the execution path of your code. It is good practise anyway to sanitize any user controlled or external data in XPath queries, as otherwise your code would be vulnerable to XPath Injection:

XPath Injection attacks occur when a web site uses user-supplied information to construct an XPath query for XML data. By sending intentionally malformed information into the web site, an attacker can find out how the XML data is structured, or access data that he may not normally have access to.

There are a few ways to avoid XPath Injection in .NET.

Regarding null bytes in general, and your StringBuilder example it appears that this may be a type of Off-by-one Error. If the StringBuilder does any string processing with user input, it may be possible for an attacker to provide a null terminated string and access the value of a character that they normally wouldn't have access to. It might also be possible for a user to supply a null terminated string and cause the program to discard whatever would normally follow in the string. These attacks would rely on the null value being persisted from the initial input location, as the pipeline may consistently terminate the input at the null byte. It is any inconsistency that is the problem.

For example, if one component treats the string 12345\06789 as 123456789 during validation, and another component treats the string as 12345 when the value is actually used then this is a problem. This was the cause of several PHP null byte related issues where PHP would read the null byte, but any system functions that were written in C classed them as a termination character. This made it possible to smuggle various strings past the PHP validation code and then enable the operating system to execute things it wasn't meant to as an aid to the attacker.

However, as .NET is a managed language this is unlikely to lead to a buffer overflow vulnerability. It might be worth further investigating if it is possible to do any of these by injecting null bytes from user input.

Community
  • 1
  • 1
SilverlightFox
  • 32,436
  • 11
  • 76
  • 145
  • It is worth testing my login prompts as they would be easy targets for null character injection in either the name or the password fields (see [how to simulate nul character from keyboard](http://stackoverflow.com/questions/9124786/how-to-simulate-nul-character-from-keyboard) ). – ergohack Nov 05 '14 at 16:21
  • @JRo: If you use an intercepting proxy such as Burp Suite you can enter the null encoded as `%00` to test inputs. – SilverlightFox Nov 05 '14 at 22:55