
I have written the following function:

using System;
using System.Text;

public void TestSB()
{
    string str = "The quick brown fox jumps over the lazy dog.";
    StringBuilder sb = new StringBuilder();
    int j = 0;    // last loop index reached, reported on failure
    int len = 0;  // builder length just before the failing Append

    try
    {
        // Append the sentence 20 million times.
        for (int i = 0; i < (10000000 * 2); i++)
        {
            j = i;
            len = sb.Length;
            sb.Append(str);
        }

        Console.WriteLine("Success ::" + sb.Length.ToString());
    }
    catch (Exception ex)
    {
        Console.WriteLine(
            ex.Message + " :: " + j.ToString() + " :: " + len.ToString());
    }
}

As I understand it, a StringBuilder has the capacity to hold over 2 billion characters (2,147,483,647, to be precise).

But when I ran the above function, it threw a System.OutOfMemoryException after reaching a length of only about 800 million characters. Moreover, I am seeing widely different results on different PCs that have the same amount of memory and a similar load.

Can anyone please explain the reason for this?

Atur
  • 1
    I would take a look at http://stackoverflow.com/questions/363680/stringbuilder-for-string-concatenation-throws-outofmemoryexception and http://stackoverflow.com/questions/1733667/memory-free-up-string-builder-an-d-byte-in-c-out-of-memory-exception – Baz1nga Sep 24 '11 at 08:07
  • You will be able to approach the maximum better with `StringBuilder sb = new StringBuilder(10000000 * 1);` Using a(n initial) capacity is always a good idea with big collections. – H H Sep 24 '11 at 08:25

1 Answer


Each character requires 2 bytes (as a char in .NET is a UTF-16 code unit). So by the time you've reached 800 million characters, that's 1.6GB of contiguous memory required [1]. Now when the StringBuilder needs to resize itself, it has to create another array of the new size (which I believe tries to double the capacity), which means trying to allocate a 3.2GB array.
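To make the arithmetic concrete, here is a minimal sketch. The class and method names (`BufferMath`, `BytesFor`) are made up for illustration, and the doubling step is the growth policy assumed above, not something confirmed for every .NET version.

```csharp
using System;

public static class BufferMath
{
    // Bytes needed to hold `chars` UTF-16 code units (sizeof(char) is 2 in .NET).
    public static long BytesFor(long chars) => chars * sizeof(char);

    public static void Main()
    {
        long current = 800_000_000;  // characters held when the exception was thrown
        long doubled = current * 2;  // capacity after a doubling resize (assumed policy)

        Console.WriteLine(BytesFor(current));                // 1600000000
        Console.WriteLine(BytesFor(doubled));                // 3200000000
        Console.WriteLine(BytesFor(doubled) > int.MaxValue); // True
    }
}
```

The last line shows that the doubled buffer would need more bytes than a 32-bit signed size can describe, which is in line with a per-object limit of roughly 2GB.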

I believe that the CLR (even on 64-bit systems) can't allocate a single object of more than 2GB in size. (That certainly used to be the case.) My guess is that your StringBuilder is trying to double in size, and blowing that limit. You may be able to get a little higher by constructing the StringBuilder with a specific capacity - a capacity of around a billion may be feasible.
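As a rough sketch of the "specific capacity" suggestion: the helper below sizes the builder once, up front, so the appends never trigger a resize. The names `PresizedBuilderDemo` and `Build` are hypothetical, and the small repeat count is only there to keep the example cheap to run. Whether a capacity near a billion characters can actually be allocated depends on the runtime version, the process bitness, and the memory available.

```csharp
using System;
using System.Text;

public static class PresizedBuilderDemo
{
    // Appends `text` `repeat` times into a builder whose capacity is reserved up front.
    public static StringBuilder Build(string text, int repeat)
    {
        // checked() throws OverflowException if the total would exceed int.MaxValue.
        int capacity = checked(text.Length * repeat);
        var sb = new StringBuilder(capacity);

        for (int i = 0; i < repeat; i++)
        {
            sb.Append(text);
        }

        return sb;
    }

    public static void Main()
    {
        StringBuilder sb = Build("The quick brown fox jumps over the lazy dog.", 1000);
        Console.WriteLine(sb.Length + " characters, capacity " + sb.Capacity);
    }
}
```

Computing the capacity from the known final length avoids guessing, and the `checked` multiplication fails fast for inputs that could never fit in a single StringBuilder.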

In the normal course of things this isn't a problem, of course - even strings requiring hundreds of megs are rare.


[1] I believe the implementation of StringBuilder actually changed in .NET 4 to use fragments in some situations, but I don't know the details. So it may not always need contiguous memory while still in builder form... but it would if you ever called ToString.

Jon Skeet
  • Well, but why can this behaviour vary between different machines, considering that the allocation limit is handled by the VM and not the system itself? – Tigran Sep 24 '11 at 08:12
  • @Tigran: It can vary based on two things: the VM implementation (different major versions, different variations based on CPU architecture) and the implementation details of `StringBuilder` itself. Oh, and how much memory is available of course... – Jon Skeet Sep 24 '11 at 08:18
  • 3
    @Tigran I think it's because the StringBuilder effectively needs **contiguous** memory to allocate its contents, and memory can be fragmented in different ways based on what the machine has been doing beforehand. You can still get OutOfMemory exceptions when there is lots of physical RAM free, because there is not enough *contiguous* memory. – Neil Fenwick Sep 24 '11 at 08:22
  • 1
    @jon Agreed, but what sounds strange to me is that the asker reports very different results on machines with apparently the same configuration. But you have actually confirmed my doubts about the machines being equal. – Tigran Sep 24 '11 at 08:24
  • I thought the whole idea of the string builder is it created a collection of strings, and only joined them on the .ToString() operation, which is why it's so much faster than string concatenation? – Rob Sep 24 '11 at 08:35
  • @Rob: No - at least that's not how it *used* to work. The reason it's usually been faster than string concatenation is that it can use a single mutable buffer (grown only when required) - whereas with string concatenation, each operation requires all the data to be copied as strings are immutable. See http://pobox.com/~skeet/csharp/stringbuilder.html for more information. – Jon Skeet Sep 24 '11 at 08:38
  • I noticed one more thing on my machine: when running it with VS2008 (.NET 3.5), it throws the exception at a lower value of the loop counter (i) than with VS2010 (.NET 4.0). – Atur Sep 24 '11 at 08:55
  • 1
    @atur: Right - that corresponds with my footnote - the implementation of StringBuilder has changed in .NET 4. – Jon Skeet Sep 24 '11 at 08:58
  • The original example I tried was on .NET 4.0, where it was able to take up to 800 million characters. .NET 3.5 crashes at about 1/3 of that value. – Atur Sep 24 '11 at 09:01