81

Short Question

I have a loop that runs 180,000 times. At the end of each iteration it is supposed to append the results to a TextBox, which is updated real-time.

Using MyTextBox.Text += someValue is causing the application to eat huge amounts of memory, and it runs out of available memory after a few thousand records.

Is there a more efficient way of appending text to a TextBox.Text 180,000 times?

Edit I really don't care about the result of this specific case, however I want to know why this seems to be a memory hog, and if there is a more efficient way to append text to a TextBox.


Long (Original) Question

I have a small app which reads a list of ID numbers in a CSV file and generates a PDF report for each one. After each pdf file is generated, the ResultsTextBox.Text gets appended with the ID Number of the report that got processed and that it was successfully processed. The process runs on a background thread, so the ResultsTextBox gets updated real-time as items get processed

I am currently running the app against 180,000 ID numbers, however the memory the application is taking up is growing exponentially as time goes by. It starts by around 90K, but by about 3000 records it is taking up roughly 250MB and by 4000 records the application is taking up about 500 MB of memory.

If I comment out the update to the Results TextBox, the memory stays relatively stationary at roughly 90K, so I can assume that writing ResultsText.Text += someValue is what is causing it to eat memory.

My question is, why is this? What is a better way of appending data to a TextBox.Text that doesn't eat memory?

My code looks like this:

try
{
    report.SetParameterValue("Id", id);

    report.ExportToDisk(ExportFormatType.PortableDocFormat,
        string.Format(@"{0}\{1}.pdf", new object[] { outputLocation, id}));

    // ResultsText.Text += string.Format("Exported {0}\r\n", id);
}
catch (Exception ex)
{
    ErrorsText.Text += string.Format("Failed to export {0}: {1}\r\n", 
        new object[] { id, ex.Message });
}

It should also be worth mentioning that the app is a one-time thing and it doesn't matter that it is going to take a few hours (or days :)) to generate all the reports. My main concern is that if it hits the system memory limit, it will stop running.

I'm fine with leaving the line updating the Results TextBox commented out to run this thing, but I would like to know if there is a more memory efficient way of appending data to a TextBox.Text for future projects.

Hamid Pourjam
  • 20,441
  • 9
  • 58
  • 74
Rachel
  • 130,264
  • 66
  • 304
  • 490
  • 7
    You could try using a `StringBuilder` to append the text then, upon completion, assign the `StringBuilder` value to the textbox. – keyboardP Jan 04 '12 at 20:59
  • @keyboardP I forgot to mention the loop runs on a background thread, and the Results TextBox get updated real-time – Rachel Jan 04 '12 at 21:01
  • 1
    I don't know if it would change anything but what if you would have a StringBuilder which appends the new Id-s and you would use a property which gets updated with the new value of the string builder and Bind this to your textbox.text property. – BigL Jan 04 '12 at 21:01
  • How tight of a loop are you appending the text in? StringBuilder is more efficient for building strings in a loop, etc, but if you are constantly updating a display string with work in between, it won't buy you much. – James Michael Hare Jan 04 '12 at 21:01
  • 2
    Why are you initializing an object array when calling string.Format? There are overloads that take 2 parameters so you can avoid creating an array. Plus when you use the params overload the array is created for you behind the scenes. – ChaosPandion Jan 04 '12 at 21:02
  • @BigL So are you saying that if I append the data to a StringBuilder, and during each iteration you could call `ResultsText.Text = sb.ToString()` and it wouldn't become a memory hog? – Rachel Jan 04 '12 at 21:03
  • 1
    string concat is not necessarily inefficient. If you are concatenating strings across multiple units of work and displaying the results in between each unit of work it will be more efficient than StringBuilder. StringBuilder is really only more efficient when you are building a string through a loop and then only writing out the result at the end of the loop. – James Michael Hare Jan 04 '12 at 21:03
  • If you are doing this in a background thread, I don't see any threadsafe code, or maybe you aren't showing the invoke code? – Darthg8r Jan 04 '12 at 21:03
  • 1
    @Rachel: Yes, doing that with StringBuilder would be a hog because it would create a new temp string every time in addition to the memory held by the StringBuilder. Concatting strings in that case is more efficient. – James Michael Hare Jan 04 '12 at 21:04
  • I don't know if it would or wouldn't just was a wild guess which i would try. :) Another guess would be you could try to use GC to manage your memory usage, i know it is not supposed to be used directly, but this seems to me a special enough case to try it. – BigL Jan 04 '12 at 21:06
  • 1
    Could you compromise and only update the display every 10th (or any _nth_) cycle? – Rick Liddle Jan 04 '12 at 21:07
  • I wouldn't force a GC call unless you need to. Memory growth in .NET is fine as long as she's not getting an OutOfMemoryException. The fact is the concats (and even string builder) will be creating temp memory artifacts when then must be eventually GC'd, unless her machine is running out of memory, there's no gain by forcing a GC. – James Michael Hare Jan 04 '12 at 21:08
  • 1
    @JamesMichaelHare lol no it should be MB. Originally I had it typed out as .5 GB and rewrote it as 500 but forgot to change the unit of measure. – Rachel Jan 04 '12 at 21:12
  • 3
    I was going to say, that's quite the impressive machine :-) – James Michael Hare Jan 04 '12 at 21:14
  • @JamesMichaelHare yes that's true but if the memory usage is growing so rapidly then in time it will hit the memory limit and you will get an OutOfMemoryException. But maybe the problem will be the undo stack of the TextBox. :) – BigL Jan 04 '12 at 21:14
  • @BigL: I agree the undo stack could be the culprit, though to the former point the OutOfMemoryException won't get thrown if the GC collects the memory on its own, forcing it to collect saves you nothing and in fact usually degrades performance. The memory she is seeing isn't actually "lost", it's just unreferenced but not yet collected which is still safe. – James Michael Hare Jan 04 '12 at 21:16
  • 1
    Does this need to be a textbox? I figure a label would work better unless the user needs to be able to type in it. – Hardwareguy Jan 05 '12 at 00:13
  • @ChaosPandion Thanks, I had actually forgotten that `string.format` had other overloads to take multiple arguments. – Rachel Jan 05 '12 at 13:11
  • I do not know if [AppendText](http://msdn.microsoft.com/en-us/library/system.windows.forms.textboxbase.appendtext.aspx) uses the [StringBuilder](http://msdn.microsoft.com/en-us/library/system.text.stringbuilder%28v=VS.100%29.aspx) function to prevent memory issues or not, but it could be easy to test: textBox1.AppendText(string.Format(@"{0}\{1}.pdf", new object[] { outputLocation, id})); –  Jan 04 '12 at 21:17
  • Check out the StringBuilder class :) – Rocklan Jan 04 '12 at 20:58
  • That will work fine if I want to dump all the results out at the end of the procedure, however the process runs in the background and the TextBox is updated real-time as the method runs – Rachel Jan 04 '12 at 21:00
  • 1
    Not related to the question, but to Crystal reports performance. I've found that if you export the report to a stream and then write that stream onto disk, you get much better performance than calling ExportToDisk. I don't know why. – John Oxley Jan 16 '12 at 11:20

12 Answers12

119

I suspect the reason the memory usage is so large is because textboxes maintain a stack so that the user can undo/redo text. That feature doesn't seem to be required in your case, so try setting IsUndoEnabled to false.

casperOne
  • 73,706
  • 19
  • 184
  • 253
keyboardP
  • 68,824
  • 13
  • 156
  • 205
  • 1
    From the MSDN link: "Memory leak If you have a memory increasing in your application because you set value from the code very often, the undo stack of the textblock can be a "leak" of the memory. By using this property you can disable it and clearing the way to leaking memory." – wave Jan 04 '12 at 22:00
  • 33
    The majority of times, users and developers would expect the textbox to function like standard textboxes (i.e. with the ability to undo/redo). In edge cases such as OP's requirements, it can prove to be a hinderance. If the majority of people use it, then it should be default. Why would you expect an edge case to force standard functionality to become opt-in? – keyboardP Jan 04 '12 at 22:22
  • 1
    Alternatively, you can also set the `UndoLimit` to a realistic value. The default of -1 indicates an unlimited stack. Zero (0) also will disable undo. – myermian Jul 18 '12 at 14:35
14

Use TextBox.AppendText(someValue) instead of TextBox.Text += someValue. It's easy to miss since it's on TextBox, not TextBox.Text. Like StringBuilder, this will avoid creating copies of the entire text each time you add something.

It would be interesting to see how this compares to the IsUndoEnabled flag from keyboardP's answer.

Cypher2100
  • 331
  • 1
  • 6
9

Don't append directly to the text property. Use a StringBuilder for the appending, then when done, set the .text to the finished string from the stringbuilder

Darthg8r
  • 12,377
  • 15
  • 63
  • 100
  • 2
    I forgot to mention the loop runs on a background thread and the results get updated real-time – Rachel Jan 04 '12 at 21:01
5

Instead of using a text box I would do the following:

  1. Open up a text file and stream the errors to a log file just in case.
  2. Use a list box control to represent the errors to avoid copying potentially massive strings.
ChaosPandion
  • 77,506
  • 18
  • 119
  • 157
4

Personally, I always use string.Concat* . I remember reading a question here on Stack Overflow years ago that had profiling statistics comparing the commonly-used methods, and (seem) to recall that string.Concat won out.

Nonetheless, the best I can find is this reference question and this specific String.Format vs. StringBuilder question, which mentions that String.Format uses a StringBuilder internally. This makes me wonder if your memory hog lies elsewhere.

**based on James' comment, I should mention that I never do heavy string formatting, as I focus on web-based development.*

Community
  • 1
  • 1
jwheron
  • 2,553
  • 2
  • 30
  • 40
  • I agree, sometimes folks get in the rut of saying "always use X cuz X is best" which is usually an oversimplification. There's a lot of subtlety between string.Concat(), string.Format() and StringBuilder. My rule of thumb is use each what it's intended for (it sounds silly, I know, but it holds true). I use concat when I'm joining strings (and then using the result immediately), I use Format when I'm performing non-trivial string formatting (padding, numeric formats, etc), and StringBuilder for building strings up during a loop to be used at the end of the loop. – James Michael Hare Jan 04 '12 at 21:12
  • @JamesMichaelHare, that makes sense to me; are you suggesting that the use of `string.Format`/`StringBuilder` is more appropriate here? – jwheron Jan 04 '12 at 21:14
  • Oh no, I was just agreeing with your general point that concat is usually best for simple string concats. The problem with "rules of thumb" are that they can change from .NET version to version if the BCL changes, thus sticking with the logically correct construct is more maintainable and usually performs better for its tasks. I actually had an older blog post where I compared the three here: http://geekswithblogs.net/BlackRabbitCoder/archive/2010/05/10/c-string-compares-and-concatenations.aspx – James Michael Hare Jan 04 '12 at 21:17
  • Duly noted -- just wanted to be sure -- and answer edited to qualify my use of the word "always." – jwheron Jan 04 '12 at 21:20
3

Maybe reconsider the TextBox? A ListBox holding string Items will probably perform better.

But the main problem seem to be the requirements, Showing 180,000 items cannot be aimed at a (human) user, neither is changing it in "Real Time".

The preferable way would be to show a sample of the data or a progress indicator.

When you do want to dump it at the poor User, batch string updates. No user could descern more than 2 or 3 changes per second. So if you produce 100/second, make groups of 50.

H H
  • 263,252
  • 30
  • 330
  • 514
  • Thanks Henk. This was a one-time thing so I was being lazy when writing it. I wanted some kind of visual output to know what the status was, and I wanted text selection capabilities and a ScrollBar. I suppose I could have used a ScrollViewer/Label, but TextBoxes have ScrollBarrs built in. I didn't expect it would cause problems :) – Rachel Jan 04 '12 at 21:20
2

Some responses have alluded to it, but nobody has outright stated it which is surprising. Strings are immutable which means a String cannot be modified after it is created. Therefore, every time you concatenate to an existing String, a new String Object needs to be created. The memory associated with that String Object also obviously needs to be created, which can get expensive as your Strings become larger and larger. In college, I once made the amateur mistake of concatenating Strings in a Java program that did Huffman coding compression. When you're concatenating extremely large amounts of text, String concatenation can really hurt you when you could have simply used StringBuilder, as some in here have mentioned.

KyleM
  • 4,445
  • 9
  • 46
  • 78
2

Use the StringBuilder as suggested. Try to estimate the final string size then use that number when instantiating the StringBuilder. StringBuilder sb = new StringBuilder(estSize);

When updating the TextBox just use assignment eg: textbox.text = sb.ToString();

Watch for cross-thread operations as above. However use BeginInvoke. No need to block the background thread while the UI updates.

Steven Licht
  • 760
  • 4
  • 5
1

A) Intro: already mentioned, use StringBuilder

B) Point: don't update too frequently, i.e.

DateTime dtLastUpdate = DateTime.MinValue;

while (condition)
{
    DoSomeWork();
    if (DateTime.Now - dtLastUpdate > TimeSpan.FromSeconds(2))
    {
        _form.Invoke(() => {textBox.Text = myStringBuilder.ToString()});
        dtLastUpdate = DateTime.Now;
    }
}

C) If that's one-time job, use x64 architecture to stay within 2Gb limit.

bohdan_trotsenko
  • 5,167
  • 3
  • 43
  • 70
1

StringBuilder in ViewModel will avoid string rebindings mess and bind it to MyTextBox.Text. This scenario will increase performance many times over and decrease memory usage.

Anatolii Gabuza
  • 6,184
  • 2
  • 36
  • 54
0

You say memory grows exponentially. No, it is a quadratic growth, i.e. a polynomial growth, which is not as dramatic as an exponential growth.

You are creating strings holding the following number of items:

1 + 2 + 3 + 4 + 5 ... + n = (n^2 + n) /2.

With n = 180,000 you get total memory allocation for 16,200,090,000 items, i.e. 16.2 billion items! This memory will not be allocated at once, but it is a lot of cleanup work for the GC (garbage collector)!

Also, bear in mind, that the previous string (which is growing) must be copied into the new string 179,999 times. The total number of copied bytes goes with n^2 as well!

As others have suggested, use a ListBox instead. Here you can append new strings without creating a huge string. A StringBuild does not help, since you want to display the intermediate results as well.

Olivier Jacot-Descombes
  • 104,806
  • 13
  • 138
  • 188
0

Something that has not been mentioned is that even if you're performing the operation in the background thread, the update of the UI element itself HAS to happen on the main thread itself (in WinForms anyway).

When updating your textbox, do you have any code that looks like

if(textbox.dispatcher.checkAccess()){
    textbox.text += "whatever";
}else{
    textbox.dispatcher.invoke(...);
}

If so, then your background op is definitely being bottlenecked by the UI Update.

I would suggest that your background op use StringBuilder as noted above, but instead of updating the textbox every cycle, try updating it at regular intervals to see if it increases performance for you.

EDIT NOTE:have not used WPF.

Daryl Teo
  • 5,394
  • 1
  • 31
  • 37