10

I have downloaded a stream as a byte[] 'raw' that is about 36MB. I then convert that into a string with

string temp = System.Text.Encoding.UTF8.GetString(raw);

Then I need to replace all "\n" with "\r\n" so I tried

string temp2 = temp.Replace("\n", "\r\n");

but it threw an "Out of Memory" exception. I then tried to create a new string with a StringBuilder:

string temp2 = new StringBuilder(temp).Replace("\n", "\r\n").ToString();

and it didn't throw the exception. Why would there be a memory issue in the first place (I'm only dealing with 36MB here), but also why does StringBuilder.Replace() work when the other doesn't?

Aeon2058
  • I saw that question, but it has more to do with performance rather than memory usage. Also, this was more of a "what's going on behind the scene?" question than a "how do I fix it?" one. – Aeon2058 May 14 '13 at 12:13

4 Answers

7

When you use:

string temp2 = temp.Replace("\n", "\r\n");

for every match of "\n" in the string temp, the system creates a new string with the replacement.

With StringBuilder this doesn't happen, because StringBuilder is mutable: you can modify the same object without needing to create another one.

Example:

string temp = "test1\ntest2\ntest3\n";

With the first method (string):

string temp2 = temp.Replace("\n", "\r\n");

is equivalent to

string aux1 = "test1\r\ntest2\ntest3\n";
string aux2 = "test1\r\ntest2\r\ntest3\n";
string temp2 = "test1\r\ntest2\r\ntest3\r\n";

With the second method (StringBuilder):

string temp2 = new StringBuilder(temp).Replace("\n", "\r\n").ToString();

is equivalent to

StringBuilder aux = new StringBuilder("test1\ntest2\ntest3\n");
// aux now holds "test1\r\ntest2\ntest3\n"
// aux now holds "test1\r\ntest2\r\ntest3\n"
// aux now holds "test1\r\ntest2\r\ntest3\r\n"
string temp2 = aux.ToString();
Dani Corretja
  • So if my string was 36MB long and had say 50,000 "\n" to replace, with string.Replace() this would require 36*50000MB to accomplish, and that's why there was a memory error? Shouldn't gc get performed on aux1, aux2, aux3... etc as they are no longer needed? – Aeon2058 May 14 '13 at 12:21
  • This doesn't seem accurate. The native C++ code that string.Replace runs is available at https://github.com/fixdpt/shared-source-cli-2.0/blob/master/clr/src/vm/comstring.cpp#L1572. It first iterates the string, finding all of the indices of substrings that will be replaced. Then it allocates exactly the right amount of memory based on that. Then it iterates the string again, copying the original to the new buffer, making replacements where necessary. – Profesor Caos Jun 07 '17 at 12:08
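
To make the two-pass approach described in the comment above concrete, here is a rough C# sketch of the same idea. It is only an illustration, not the runtime's actual code: the class name `ReplaceSketch` and the method `TwoPassReplace` are made up for this example, and it assumes ordinal matching and a non-null replacement value.

```csharp
using System;
using System.Collections.Generic;

static class ReplaceSketch
{
    // Hypothetical helper: illustrates "find all matches, allocate once, copy once".
    public static string TwoPassReplace(string source, string oldValue, string newValue)
    {
        if (string.IsNullOrEmpty(oldValue))
            throw new ArgumentException("oldValue must be non-empty", nameof(oldValue));

        // Pass 1: record the start index of every non-overlapping match.
        var matches = new List<int>();
        int pos = 0;
        while ((pos = source.IndexOf(oldValue, pos, StringComparison.Ordinal)) >= 0)
        {
            matches.Add(pos);
            pos += oldValue.Length;
        }
        if (matches.Count == 0) return source;

        // Allocate the result buffer once, at exactly its final length.
        int finalLength = source.Length + matches.Count * (newValue.Length - oldValue.Length);
        var buffer = new char[finalLength];

        // Pass 2: copy the unchanged runs and write the replacement at each match.
        int src = 0, dst = 0;
        foreach (int m in matches)
        {
            source.CopyTo(src, buffer, dst, m - src);
            dst += m - src;
            newValue.CopyTo(0, buffer, dst, newValue.Length);
            dst += newValue.Length;
            src = m + oldValue.Length;
        }
        source.CopyTo(src, buffer, dst, source.Length - src);

        // Note: new string(char[]) copies the buffer, so this managed sketch
        // uses one more allocation than a native implementation would need.
        return new string(buffer);
    }
}
```

With this scheme the number of intermediate strings does not grow with the number of matches. The cost is one result buffer of the final size, which for a 36MB input is still a large single allocation.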
3

From the MSDN documentation for StringBuilder:

Most of the methods that modify an instance of this class return a reference to that same instance, and you can call a method or property on the reference. This can be convenient if you want to write a single statement that chains successive operations.

So when you call Replace on a String, a new object (a big one, 36MB of data) has to be allocated to hold the new string. StringBuilder, by contrast, works on the same instance and does not create a new one.
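
A small sketch of the behaviour the MSDN passage describes: StringBuilder.Replace hands back the same instance it was called on, while string.Replace leaves the original untouched and, when something is actually replaced, returns a different string. The class and method names here (`ChainingDemo`, `BuilderReturnsSelf`, `StringReturnsNew`) are invented for illustration.

```csharp
using System;
using System.Text;

static class ChainingDemo
{
    // True if StringBuilder.Replace returns the builder it was called on.
    public static bool BuilderReturnsSelf()
    {
        var sb = new StringBuilder("a\nb");
        StringBuilder result = sb.Replace("\n", "\r\n");
        return ReferenceEquals(sb, result) && sb.ToString() == "a\r\nb";
    }

    // True if string.Replace produced a separate string and left the original as it was.
    public static bool StringReturnsNew()
    {
        string s = "a\nb";
        string replaced = s.Replace("\n", "\r\n");
        return !ReferenceEquals(s, replaced) && s == "a\nb" && replaced == "a\r\nb";
    }
}
```

This only shows which object comes back from each call. It says nothing about how much memory either call needs internally.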

Toan Vo
1

There is a concept of memory pressure, meaning that the more temporary objects created, the more often garbage collection runs.

So: StringBuilder creates fewer temporary objects and adds less memory pressure.

StringBuilder Memory

Replace

We next use StringBuilder to replace characters in loops. First convert the string to a StringBuilder, and then call StringBuilder's methods. This is faster—the StringBuilder type uses character arrays internally.
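
As a sketch of "convert the string to a StringBuilder, then call StringBuilder's methods", applied to the line-ending case from the question: the helper below (the names `LineEndings` and `ToCrLf` are hypothetical) first collapses any existing "\r\n" to "\n", so that input which already contains Windows line endings does not end up with "\r\r\n". That extra step is an assumption about what is wanted, and it is not part of the original question.

```csharp
using System.Text;

static class LineEndings
{
    // Hypothetical helper: normalizes "\n" and "\r\n" line endings to "\r\n".
    public static string ToCrLf(string text)
    {
        var sb = new StringBuilder(text);
        sb.Replace("\r\n", "\n");   // collapse existing CRLF pairs first
        sb.Replace("\n", "\r\n");   // then expand every LF to CRLF
        return sb.ToString();       // the final string is still one full-size allocation
    }
}
```

Usage would look like `string temp2 = LineEndings.ToCrLf(temp);`.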

Sathish
0

Strings are immutable in C#. If you use the string.Replace() method, the system will create a String object for each replacement. The StringBuilder class helps you avoid that object creation.

Hieu Tran