I'm trying to crash-test my parser with the following string:

var theWholeUTF8 = new StringBuilder();
for (char code = Char.MinValue; code <= Char.MaxValue; code++)
{
    theWholeUTF8.Append(code);
}

However, the test itself crashes while building the string and throws an OutOfMemoryException. What am I missing?

svick
Mike Roll
  • Read this link: http://stackoverflow.com/questions/1769447/interesting-outofmemoryexception-with-stringbuilder – Eldar Agalarov Aug 11 '13 at 20:13
  • `theWholeUTF8` isn’t really an accurate variable name; UTF-8 is an encoding, and .NET strings use UTF-16. – Ry- Aug 11 '13 at 20:17
  • Also, that string won't actually be valid UTF-16. And it won't contain all Unicode code points. – svick Aug 12 '13 at 12:03
  • @MikeRoll Because of [surrogate pairs](http://en.wikipedia.org/wiki/UTF-16#Code_points_U.2B10000_to_U.2B10FFFF). – svick Aug 12 '13 at 19:45
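
As an illustration of the surrogate-pair point in the comments above, here is a minimal sketch of building a string that does contain every Unicode scalar value as valid UTF-16. The class and method names (AllCodePoints, Build) are hypothetical and not from the original post; the sketch assumes that skipping the surrogate range U+D800 to U+DFFF is what the test wants.

```csharp
using System;
using System.Text;

public static class AllCodePoints
{
    // Builds a valid UTF-16 string containing every Unicode scalar value once.
    // (Hypothetical helper, not part of the original question.)
    public static string Build()
    {
        var sb = new StringBuilder();

        // int is used for the counter so that it cannot wrap around.
        for (int codePoint = 0; codePoint <= 0x10FFFF; codePoint++)
        {
            // Surrogate code points are not valid on their own; skip them.
            if (codePoint >= 0xD800 && codePoint <= 0xDFFF)
            {
                continue;
            }

            // Yields one char for BMP code points and a surrogate pair above U+FFFF.
            sb.Append(char.ConvertFromUtf32(codePoint));
        }

        return sb.ToString();
    }

    public static void Main()
    {
        string all = Build();
        Console.WriteLine(all.Length); // 2160640 UTF-16 code units
    }
}
```

The length is 63,488 chars for the Basic Multilingual Plane (65,536 minus 2,048 surrogates) plus two chars for each of the 1,048,576 supplementary code points.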

1 Answer

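The problem is that code overflows and wraps around to 0 after reaching Char.MaxValue. The condition code <= Char.MaxValue is therefore always true for a char, so the for loop never ends and the StringBuilder grows until memory runs out.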

Try using an int counter instead:

var theWholeUTF8 = new StringBuilder();

for (int code = Char.MinValue; code <= Char.MaxValue; code++)
{
    theWholeUTF8.Append((char)code);
}

To make it clear, this is what happens when code is a char. At a certain point:

code = Char.MaxValue - 1

code++; // code == Char.MaxValue
is code <= Char.MaxValue? Yes
theWholeUTF8.Append((char)code);

code++; // code == 0
is code <= Char.MaxValue? Yes
theWholeUTF8.Append((char)code);

and so on!
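
The wrap-around traced above can be observed directly. This is a minimal sketch, assuming the default unchecked arithmetic context; the class and method names (CharOverflowDemo, Next) are illustrative only and not from the original code.

```csharp
using System;

public static class CharOverflowDemo
{
    // Returns the char that follows c, wrapping around on overflow.
    // (Hypothetical helper; unchecked is spelled out to make the assumption explicit.)
    public static char Next(char c)
    {
        unchecked
        {
            c++;
        }

        return c;
    }

    public static void Main()
    {
        char code = Next(Char.MaxValue);

        Console.WriteLine((int)code);             // 0
        Console.WriteLine(code <= Char.MaxValue); // True, so the loop condition never fails
    }
}
```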

One possible solution is to use a wider type, such as int, for code. Another solution is:

for (char code = Char.MinValue; code < Char.MaxValue; code++)
{
    theWholeUTF8.Append(code);
}

theWholeUTF8.Append(Char.MaxValue);

where the loop stops when code == Char.MaxValue and Char.MaxValue is then appended manually.

Another solution, obtained by moving the check before the increment:

char code = Char.MinValue;

while (true)
{
    theWholeUTF8.Append(code);

    if (code == Char.MaxValue)
    {
        break;
    }

    code++;
}
xanatos
  • Aha that's it! Or use `code < Char.MaxValue` instead of `code <= Char.MaxValue`. A char can never be greater than its max value :) – Alex MDC Aug 11 '13 at 20:16
  • @AlexMDC But then he would lose the Char.MaxValue "value" and he would have to add it manually. – xanatos Aug 11 '13 at 20:17
  • Good point. That's important if you are trying to "crash-test" a piece of code. – Alex MDC Aug 11 '13 at 20:18