How is the StringBuilder class implemented? Does it internally create new string objects each time we append?
- +1 I learned something new from this question as well :) – Brian Rasmussen Aug 25 '10 at 10:45
- @Brian Rasmussen wait for Jon Skeet's answer. I bet it will be huge and full of new stuff to learn ;) – prostynick Aug 25 '10 at 11:01
- Just a guess. It is chunked to avoid the LOH for large strings. – Aug 26 '10 at 06:37
6 Answers
In .NET 2.0 it uses the `String` class internally. `String` is only immutable outside of the `System` namespace, so `StringBuilder` can do that.
In .NET 4.0 `StringBuilder` was changed to use a `char[]`.
In 2.0 `StringBuilder` looked like this:
    public sealed class StringBuilder : ISerializable
    {
        // Fields
        private const string CapacityField = "Capacity";
        internal const int DefaultCapacity = 0x10;
        internal IntPtr m_currentThread;
        internal int m_MaxCapacity;
        internal volatile string m_StringValue; // HERE ----------------------
        private const string MaxCapacityField = "m_MaxCapacity";
        private const string StringValueField = "m_StringValue";
        private const string ThreadIDField = "m_currentThread";
        // ... (remaining members omitted)
    }
But in 4.0 it looks like this:
    public sealed class StringBuilder : ISerializable
    {
        // Fields
        private const string CapacityField = "Capacity";
        internal const int DefaultCapacity = 0x10;
        internal char[] m_ChunkChars; // HERE --------------------------------
        internal int m_ChunkLength;
        internal int m_ChunkOffset;
        internal StringBuilder m_ChunkPrevious;
        internal int m_MaxCapacity;
        private const string MaxCapacityField = "m_MaxCapacity";
        internal const int MaxChunkSize = 0x1f40;
        private const string StringValueField = "m_StringValue";
        private const string ThreadIDField = "m_currentThread";
        // ... (remaining members omitted)
    }
So evidently it was changed from using a `string` to using a `char[]`.
EDIT: Updated answer to reflect changes in .NET 4 (that I only just discovered).
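As a rough check on the `MaxChunkSize = 0x1f40` constant in the 4.0 listing, the arithmetic below converts it to bytes and compares it with the large object heap threshold guessed at in the comments on the question. The 85,000-byte figure is the commonly documented default threshold and is an assumption here, not something taken from the listing. The class name `ChunkSizeDemo` is made up for illustration, and array header overhead is ignored.

```csharp
using System;

// Illustrative arithmetic only; ChunkSizeDemo is a hypothetical name.
public static class ChunkSizeDemo
{
    // Value copied from the decompiled 4.0 field list (0x1f40 == 8000 chars).
    public const int MaxChunkSize = 0x1f40;

    // Assumption: default large object heap threshold in bytes.
    public const int LohThresholdBytes = 85000;

    // A char is 2 bytes in .NET, so a full chunk holds 16,000 bytes of data.
    public const int ChunkBytes = MaxChunkSize * sizeof(char);

    public static void Main()
    {
        Console.WriteLine(MaxChunkSize);                   // 8000
        Console.WriteLine(ChunkBytes);                     // 16000
        Console.WriteLine(ChunkBytes < LohThresholdBytes); // True
    }
}
```

Under these assumptions a full chunk stays well below the threshold, which is consistent with the guess, although it does not prove that this was the design motivation.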

- Had no idea.. Think Im gonna do some reflector magic to satisfy my curiosity :) – cwap Aug 25 '10 at 10:33
- @Brian: as far as I know it holds a `Char` array internally, not a `String` (at least in .NET 4, perhaps this has changed?) – Fredrik Mörk Aug 25 '10 at 10:39
- @Fredrik - in the MS implementation, it really is a `string` that gets mutated – Marc Gravell Aug 25 '10 at 10:40
- @Marc: this got me curious so I checked with Reflector; looks like this has changed. It was a `string` before, now it seems to be a `char` array being manipulated instead. – Fredrik Mörk Aug 25 '10 at 10:42
- @Fredrik: I was just going through the code in Reflector while you commented. I have updated the answer. – Brian Rasmussen Aug 25 '10 at 10:53
- http://www.nesterovsky-bros.com/weblog/2010/08/25/StringAndStringBuilderInNET4.aspx – Aug 25 '10 at 22:22
- @Brian: NP. It was posted today so they could have easily copied your answer :) – Aug 26 '10 at 04:20
The accepted answer misses the mark by a mile. The significant change to `StringBuilder` in 4.0 is not the change from an unsafe `string` to `char[]`; it's the fact that `StringBuilder` is now actually a linked list of `StringBuilder` instances.
The reason for this change should be obvious: now there is never a need to reallocate the buffer (an expensive operation, since, along with allocating more memory, you also have to copy all the contents from the old buffer to the new one).
This means calling `ToString()` is now slightly slower, since the final string needs to be computed, but doing a large number of `Append()` operations is now significantly faster. This fits in with the typical use case for `StringBuilder`: a lot of calls to `Append()`, followed by a single call to `ToString()`.
You can find benchmarks here. The conclusion? The new linked-list `StringBuilder` uses marginally more memory, but is significantly faster for the typical use case.
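The linked-list idea can be sketched roughly as follows. This is a simplified illustration, not the framework's code: the class `ChunkedBuilder`, its field names, and its growth policy are all made up here, loosely mirroring the `m_ChunkChars`, `m_ChunkLength`, `m_ChunkOffset`, and `m_ChunkPrevious` fields from the decompiled listing. When the current chunk fills up, it is pushed back as a "previous" node and a fresh array is allocated, so already-written characters are never copied until `ToString()`.

```csharp
using System;

// Hypothetical sketch of a chunked builder; not the real StringBuilder.
public sealed class ChunkedBuilder
{
    private const int MaxChunkSize = 8000; // 0x1f40, as in the decompiled listing

    private char[] _chunk;            // characters of the newest chunk
    private int _chunkLength;         // characters used in _chunk
    private int _chunkOffset;         // total characters held by older chunks
    private ChunkedBuilder _previous; // older chunk, or null for the first one

    public ChunkedBuilder(int capacity = 16)
    {
        _chunk = new char[Math.Max(capacity, 1)];
    }

    // Used only to freeze the current state as an older node in the list.
    private ChunkedBuilder(char[] chunk, int length, int offset, ChunkedBuilder previous)
    {
        _chunk = chunk;
        _chunkLength = length;
        _chunkOffset = offset;
        _previous = previous;
    }

    public int Length => _chunkOffset + _chunkLength;

    public ChunkedBuilder Append(string value)
    {
        if (string.IsNullOrEmpty(value)) return this;

        int copied = 0;
        while (copied < value.Length)
        {
            if (_chunkLength == _chunk.Length)
                AddChunk(value.Length - copied);

            // Fill whatever room is left in the current chunk.
            int count = Math.Min(_chunk.Length - _chunkLength, value.Length - copied);
            value.CopyTo(copied, _chunk, _chunkLength, count);
            _chunkLength += count;
            copied += count;
        }
        return this;
    }

    private void AddChunk(int minimumSize)
    {
        // Link the full chunk in as "previous"; its contents are not copied.
        _previous = new ChunkedBuilder(_chunk, _chunkLength, _chunkOffset, _previous);
        _chunkOffset += _chunkLength;
        _chunkLength = 0;
        // Assumed growth policy: roughly double, capped at MaxChunkSize.
        _chunk = new char[Math.Max(minimumSize, Math.Min(_chunkOffset, MaxChunkSize))];
    }

    public override string ToString()
    {
        // The only full copy: walk the list backwards and place each chunk.
        var result = new char[Length];
        for (ChunkedBuilder node = this; node != null; node = node._previous)
            Array.Copy(node._chunk, 0, result, node._chunkOffset, node._chunkLength);
        return new string(result);
    }
}

public static class ChunkedBuilderDemo
{
    public static void Main()
    {
        var sb = new ChunkedBuilder(4);
        sb.Append("Hello").Append(", ").Append("chunked world");
        Console.WriteLine(sb.ToString()); // Hello, chunked world
        Console.WriteLine(sb.Length);     // 20
    }
}
```

With a starting capacity of 4, the three appends above end up in three chunks, and only `ToString()` touches every character again. That is the trade-off described in the answer: cheaper appends, a slightly more expensive final copy.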

Not really. It uses an internal character buffer, and only when the buffer's capacity is exhausted does it allocate a new buffer. An append operation simply adds to this buffer; a string object is created only when `ToString()` is called. Hence it is advisable for many string concatenations, since each traditional string concatenation creates a new string. If you have a rough idea of the final size, you can also specify an initial capacity to avoid multiple allocations.
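Whatever the internal representation, the usage advice can be illustrated with the public API. The sketch below builds the same text twice, once with repeated concatenation and once with a `StringBuilder` given an estimated capacity through its `StringBuilder(int)` constructor. The class and method names (`CapacityDemo`, `BuildWithConcatenation`, `BuildWithStringBuilder`) are made up for this example, and no timing claims are made.

```csharp
using System;
using System.Text;

// Hypothetical demo class; only StringBuilder itself is a framework type.
public static class CapacityDemo
{
    // Each += produces a new string holding everything built so far.
    public static string BuildWithConcatenation(int count)
    {
        string result = "";
        for (int i = 0; i < count; i++)
            result += "ab";
        return result;
    }

    // The builder is pre-sized, so appends write into existing storage
    // and a string is created only by the final ToString() call.
    public static string BuildWithStringBuilder(int count)
    {
        var sb = new StringBuilder(count * 2); // rough estimate of final length
        for (int i = 0; i < count; i++)
            sb.Append("ab");
        return sb.ToString();
    }

    public static void Main()
    {
        string a = BuildWithConcatenation(1000);
        string b = BuildWithStringBuilder(1000);
        Console.WriteLine(a == b);   // True
        Console.WriteLine(b.Length); // 2000
    }
}
```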
Edit: People are pointing out that my understanding is wrong. Please ignore this answer (I'd rather not delete it; it will stand as proof of my ignorance :-)

- It acts *as though* it were a character buffer, but it really is a mutated `string` instance. Honest. – Marc Gravell Aug 25 '10 at 10:35
- Thanks Marc - I was under impression that it uses character buffer. It means that it would have some native implementation to mutate string object. – VinayC Aug 25 '10 at 10:37
- sure, but it is a core framework class. It has access to the native implementation. – Marc Gravell Aug 25 '10 at 10:41
- apols, it looks like (previous comments on this page) this has changed in .NET 4. – Marc Gravell Aug 25 '10 at 10:52
- Never mind - I was under the same impression for even previous versions - so sure that I didn't even bother to check it via reflector. – VinayC Aug 25 '10 at 10:55
- Anyway, let's make it clear, this answer is the only correct answer for .NET 4.0 here for the time being, and it should not downplay itself as "incorrect". Some other answers have discovered the `char[]`, but that's a useless fact without noticing the ropes structure (`m_ChunkPrevious`) which is the real new thing here. – Jirka Hanika Aug 10 '12 at 22:03
I have made a small sample to demonstrate how StringBuilder works in .NET 4. The contract is:

    public interface ISimpleStringBuilder
    {
        ISimpleStringBuilder Append(string value);
        ISimpleStringBuilder Clear();
        int Length { get; }
        int Capacity { get; }
    }

And this is a very basic implementation:

    public class SimpleStringBuilder : ISimpleStringBuilder
    {
        public const int DefaultCapacity = 32;

        private char[] _internalBuffer;

        public int Length { get; private set; }
        public int Capacity { get; private set; }

        public SimpleStringBuilder(int capacity)
        {
            Capacity = capacity;
            _internalBuffer = new char[capacity];
            Length = 0;
        }

        public SimpleStringBuilder() : this(DefaultCapacity) { }

        public ISimpleStringBuilder Append(string value)
        {
            char[] data = value.ToCharArray();

            // make sure there is room for the additional data
            InternalEnsureCapacity(data.Length);

            foreach (char t in data)
            {
                _internalBuffer[Length] = t;
                Length++;
            }

            return this;
        }

        public ISimpleStringBuilder Clear()
        {
            _internalBuffer = new char[Capacity];
            Length = 0;
            return this;
        }

        public override string ToString()
        {
            // use only the first Length characters; the rest of the buffer is unused
            return new string(_internalBuffer, 0, Length);
        }

        private void InternalExpandBuffer()
        {
            // double the capacity
            Capacity *= 2;

            // copy the old contents to the new, larger array
            var tmpBuffer = new char[Capacity];
            for (int i = 0; i < _internalBuffer.Length; i++)
            {
                tmpBuffer[i] = _internalBuffer[i];
            }
            _internalBuffer = tmpBuffer;
        }

        private void InternalEnsureCapacity(int additionalLengthRequired)
        {
            // keep doubling until the additional data fits
            // (a capacity of 0 is not handled; see the note on input validation)
            while (Length + additionalLengthRequired > Capacity)
            {
                InternalExpandBuffer();
            }
        }
    }

This code is not thread-safe, doesn't do any input validation, and doesn't use the internal (unsafe) magic of System.String. It does, however, demonstrate the idea behind the StringBuilder class.
Some unit tests and the full sample code can be found on GitHub.

If I look at .NET 2 in .NET Reflector, I find this:
    public StringBuilder Append(string value)
    {
        if (value != null)
        {
            string stringValue = this.m_StringValue;
            IntPtr currentThread = Thread.InternalGetCurrentThread();
            if (this.m_currentThread != currentThread)
            {
                stringValue = string.GetStringForStringBuilder(stringValue, stringValue.Capacity);
            }
            int length = stringValue.Length;
            int requiredLength = length + value.Length;
            if (this.NeedsAllocation(stringValue, requiredLength))
            {
                string newString = this.GetNewString(stringValue, requiredLength);
                newString.AppendInPlace(value, length);
                this.ReplaceString(currentThread, newString);
            }
            else
            {
                stringValue.AppendInPlace(value, length);
                this.ReplaceString(currentThread, stringValue);
            }
        }
        return this;
    }
So it is a mutated string instance...
EDIT: Except in .NET 4, where it is a `char[]`.
If you want to see one of the possible implementations (similar to the one Microsoft shipped up to v3.5), you can look at the source of the Mono implementation on GitHub.
