1

As the title suggests, is there any reason I shouldn't do the following to append something on to the end of a string:

string someString = "test";
someString += "Test";
astro boy
    For "longer" strings or for "many times" it can do "significantly more work than required". In these cases, see `StringBuilder`. (I find the usage cases .. rare and prefer normal string concatenation or another form of streaming.) –  Jan 15 '13 at 03:03
    Check this out: http://www.dotnetperls.com/string-concat – Valamas Jan 15 '13 at 03:07
  • The only reason not to use the above would be for performance with VERY large strings; otherwise IMO it's fine to concatenate this way. – sa_ddam213 Jan 15 '13 at 03:08
  • See http://stackoverflow.com/questions/1612797/string-concatenation-vs-string-builder-performance?lq=1 , http://stackoverflow.com/questions/1972983/string-concatenation-vs-string-builder-append (and similar) for comparisons with StringBuilder. –  Jan 15 '13 at 03:10

3 Answers

6

If done once, it's no big deal. The contents of the two strings are copied into a newly created string, which is then assigned to someString. If you do this operation many times (for example, in a loop), there is a problem: each execution allocates a new string instance on the heap. That's why it's better to use a buffer-based type, such as StringBuilder, if you have to append content to a string many times.
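A minimal sketch of the difference (the loop bound and method names are illustrative, not from the question):

```csharp
using System.Text;

class ConcatDemo
{
    // Repeated +=: each iteration allocates a brand-new string and
    // copies everything accumulated so far into it.
    public static string ConcatLoop(int n)
    {
        string s = "";
        for (int i = 0; i < n; i++)
            s += i.ToString();
        return s;
    }

    // StringBuilder: appends into a resizable internal buffer and
    // materialises a single string only at the end.
    public static string BuildLoop(int n)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < n; i++)
            sb.Append(i);
        return sb.ToString();
    }

    static void Main()
    {
        System.Console.WriteLine(ConcatLoop(5)); // "01234"
        System.Console.WriteLine(BuildLoop(5));  // "01234"
    }
}
```

Both methods produce the same result; they differ only in how many intermediate strings get allocated and copied along the way.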

e_ne
    And doing it 1000 times (on "reasonably sized" strings) is still generally no big deal .. 97/3; that is, I only switch to using `StringBuilder` in *specific* cases where I have determined that it is a concern. (An example might be in a template generator.) –  Jan 15 '13 at 03:04
  • "you're allocating a new object in the heap, a new string instance" - this is not the problem in and of itself. .NET can allocate garbage objects quickly. – millimoose Jan 15 '13 at 03:05
  • @millimoose Adding numerous objects on the heap means that the GC cycle will have to execute more often. Am I mistaken somewhere? – e_ne Jan 15 '13 at 03:06
    @Eve Yes. This is expected in a normal .NET program. Generational GC is heavily optimised (in fact it's the entire point of the approach) towards dealing with this very very efficiently. – millimoose Jan 15 '13 at 03:07
  • @millimoose Thinking about it, it makes sense, if we assume that the string gets progressively longer -- And its content needs to be copied for each iteration. But I thought that GC pressure is an important factor in why buffers are preferred for many iterations. If you could clarify, I'd be happy to go to bed a little bit more knowledgeable than before. – e_ne Jan 15 '13 at 03:10
    +1. @millimoose, GC is fast doing its work, but it is still slower than not doing any work at all. If code looks about the same I see no reasons to explicitly avoid usage of `StringBuilder` (or potentially Stream/Writer classes). – Alexei Levenkov Jan 15 '13 at 03:17
    @Eve The copying is far more likely to be a problem, especially if the number of component strings and their length is large. As an example, consider a case where you're concatenating 100 strings, of 10 characters each, and the GC kicks in after every 10 string allocations, where object handles have 8 bytes. The GC only has to examine 10 string references in the youngest generation (80 bytes). However, towards the end of the loop, you have to copy nearly 1000 characters (2000 bytes) on every iteration. – millimoose Jan 15 '13 at 03:18
    @Eve My point isn't that GC pressure doesn't exist, but that it doesn't scale with the length of the strings (unlike the cost of copying), so it wouldn't be the dominant contributor to whatever slowdown using `string.Concat()` would cause. Also, talking about GC pressure if you use LINQ in your code **at all** (like most people do) is pretty much crazy, and trying to blindly microoptimise object allocations sounds like missing the forest for a blade of grass. – millimoose Jan 15 '13 at 03:22
  • @millimoose the GC has to examine 10 string references plus whatever other objects are in that generation. Plus, the other generations will be collected more frequently, too. GC pressure may be less significant than copying in most cases, but not necessarily in all. – phoog Jan 15 '13 at 06:07
  • @phoog I was trying to illustrate a point about the GC overhead this code alone contributes, rather than construct an accurate model. (The time needed to collect "whatever other objects" is the fault of whatever code created them.) I made the assumption that running a single garbage object through one-two generations of GC is cheaper than copying an N-character array for sufficiently large N. (One-two generations because even in my example with crazy tiny generations, only 1% of the intermediate strings will live past gen. 2 - with realistic generation sizes this would be negligible.) – millimoose Jan 15 '13 at 21:04
3

If you do the following:

var s = "abc";
s += "def";
s += "ghi";

what (more-or-less) happens is that on line 2, as the new string is being created, the contents of abc are copied into it, then the content of def is copied into it, then this new string (abcdef) is assigned to the variable.

Then on line 3, another new string is created, the contents of the previous value of the variable (abcdef) are copied into it, then ghi, then the result is assigned to the variable.

If you do this repeatedly, you can see that the beginning of the string is being copied again and again on every += operation. In a loop that builds up a long string this could make an impact. This can be optimised by using a StringBuilder or a List<string> with string.Concat(), so that you only ever copy every part of the final result once (or some other constant number independent of the number of loop iterations).
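A sketch of the collect-then-concatenate pattern described above (the method name and loop bound are illustrative):

```csharp
using System.Collections.Generic;

class ConcatOnceDemo
{
    // Collect the pieces first, then concatenate in one call:
    // each piece's characters are copied into the result exactly once,
    // instead of being re-copied on every += iteration.
    public static string BuildParts(int n)
    {
        var parts = new List<string>();
        for (int i = 0; i < n; i++)
            parts.Add(i.ToString());
        return string.Concat(parts);   // single pass over all parts
    }

    static void Main()
    {
        System.Console.WriteLine(BuildParts(4)); // "0123"
    }
}
```

This keeps the total copying proportional to the length of the final string rather than to the number of iterations times the running length.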

millimoose
  • Just wanted to say thanks for the explanation, I really do believe that memory copy plays a big (if not bigger) role compared to GC. I hope we can both agree on the fact that it depends on how the appending is done (string appending to itself, new string appending to string, new string appending to new string, all different cases) and on the number of iterations. Still +1 for the precise explanation. – e_ne Jan 15 '13 at 03:56
  • I suppose that `string.Concat()` would do a quicker job of concatenating a list of strings than `string.Join()` would. – phoog Jan 15 '13 at 06:11
  • @phoog That's probably my Python showing; it's a better choice in this context though so I'll edit it in. For what it's worth, I probably use `string.Format()` much much more often than either because it's more convenient and readable (to me), which pretty much loses me performance comparisons by default. – millimoose Jan 15 '13 at 20:53
0

Your approach will work fine. You could use a StringBuilder as an alternative.

var sb = new StringBuilder();
sb.Append("Foo");
sb.Append(" - ");
sb.Append("Bar"); 

string someString = sb.ToString();
Valamas
jake
  • A StringBuilder in a situation like this is a bit more code for me, since the original string is being passed into the function. – astro boy Jan 15 '13 at 03:13
  • @astroboy - If you're only concatenating two strings then your original approach works well. – jake Jan 15 '13 at 03:14