3

In an application, a String is an often used data type. What we know, is that the mutation of a String uses lots of memory. So what we can do is to use a StringBuilder/StringBuffer.

But at what point should we change to StringBuilder?
And what should we do, when we have to split it or to replace characters in there?

eg:

 //original:
 String[] split = string.split("\\?");  // split takes a regex, so a literal '?' must be escaped
 //better? :
 String[] split = stringBuilder.toString().split("\\?");

or

 //original:
 String replacedString = string.replace("l","st");  
 //better? :  
 String replacedString = stringBuilder.toString().replace("l","st");  
 //or  
 StringBuilder replacedStringBuilder = new StringBuilder(stringBuilder.toString().replace("l","st"));
Neifen
  • 4
    If you are concerned about such an invasive refactoring and you don't know the difference between a `StringBuilder` and a `StringBuffer` (and probably how exactly they differ from a `String`), you are probably the wrong person to do this refactoring. – Roland Illig Sep 22 '11 at 06:41
  • 2
    It's my application ;) Why should somebody else do the refactoring? Otherwise, where is the learning effect if I don't do it? – Neifen Sep 22 '11 at 06:50
  • Ok then. When you do this refactoring you will have to change a *lot* of the APIs you are using, since many of them require `String`s anyway. You should also tell us a bit more about the strings you want to optimize: are they used only in a small part of the application? which components of the whole system need them, and do they require them to be passed around as `String`s? – Roland Illig Sep 22 '11 at 06:55
  • Most of them are only needed internally (just within a method or a class), so they are not needed by other components of the system. – Neifen Sep 22 '11 at 07:02

5 Answers

6

In your examples, there are no benefits in using a StringBuilder, since you use the toString method to create an immutable String out of your StringBuilder.

You should only copy the contents of a StringBuilder into a String after you are done appending it (or modifying it in some other way).
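As a minimal sketch of that pattern, the example below makes every change on the StringBuilder and calls toString() only once at the end. The class and method names are hypothetical and only serve the illustration.

```java
final class BuildThenConvert {

    // Hypothetical helper: applies several in-place edits, then converts once.
    static String buildGreeting(String name) {
        StringBuilder sb = new StringBuilder();
        sb.append("hello, ").append(name);                     // append at the end
        sb.setCharAt(0, Character.toUpperCase(sb.charAt(0)));  // modify a character in place
        sb.insert(0, ">> ");                                   // insert at the front
        sb.append('!');
        return sb.toString();                                  // single copy into an immutable String
    }

    public static void main(String[] args) {
        System.out.println(buildGreeting("world"));
    }
}
```

Calling toString() after each step instead would create an intermediate String every time, which is exactly what the builder is meant to avoid.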

The problem with Java's StringBuilder is that it lacks some methods you get when using a plain string (check this thread, for example: How to implement StringBuilder.replace(String, String)).

What we know, is that a String uses lots of memory.

Actually, to be precise, a String uses less memory than a StringBuilder with equivalent contents. A StringBuilder instance has some additional constant overhead, and usually has a preallocated buffer that can hold more data than is needed at any given moment (to reduce allocations). The issue with Strings is that they are immutable, which means Java needs to create a new instance whenever you need to change the contents.
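The preallocated buffer can be observed through capacity() and length(), both of which are regular StringBuilder methods. The sketch below is only an illustration: the helper name spareCapacity is made up, and the exact numbers printed depend on the JDK.

```java
final class BuilderCapacity {

    // Hypothetical helper: chars the builder has reserved beyond its current content.
    static int spareCapacity(StringBuilder sb) {
        return sb.capacity() - sb.length();
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("abc");
        System.out.println("length   = " + sb.length());
        System.out.println("capacity = " + sb.capacity());
        System.out.println("spare    = " + spareCapacity(sb));

        // trimToSize() asks the builder to give up unused buffer space.
        sb.trimToSize();
        System.out.println("spare after trimToSize = " + spareCapacity(sb));
    }
}
```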

To conclude, StringBuilder is not designed for the operations you mentioned (split and replace), and it won't yield much better performance in any case. A split method cannot benefit from StringBuilder's mutability, since it creates an array of immutable strings as its output anyway. A replace method still needs to iterate through the entire string, and do a lot of copying if the replacement is not the same size as the text being searched for.
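If a replace directly on the builder is still wanted, it can be sketched with indexOf and replace(int, int, String), which StringBuilder does provide. The replaceAll helper below is hypothetical (StringBuilder has no such method), and it still shifts the remaining characters on every hit when the two strings differ in length.

```java
final class BuilderReplace {

    // Hypothetical helper: replaces every occurrence of target with replacement inside sb.
    static void replaceAll(StringBuilder sb, String target, String replacement) {
        if (target.isEmpty()) {
            return; // an empty search string would match everywhere
        }
        int index = sb.indexOf(target);
        while (index != -1) {
            sb.replace(index, index + target.length(), replacement);
            // Continue after the inserted text so it is not scanned again.
            index = sb.indexOf(target, index + replacement.length());
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("hello world");
        replaceAll(sb, "l", "st");
        System.out.println(sb);
    }
}
```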

If you need to do a lot of appending, then go for a StringBuilder. Since it uses a "mutable" array of characters under the hood, adding data to the end will be especially efficient.

This article compares the performance of several StringBuilder and String methods (although I would take the Concatenation part with a grain of salt, because it doesn't mention dynamic string appending at all and concentrates on a single Join operation only).

vgru
  • 1
    OK, what we know is that the mutation of a String uses lots of memory – Neifen Sep 22 '11 at 06:45
  • +1 for mentioning that `String` uses less memory than `StringBuilder`. I missed that in my answer. – user183037 Sep 22 '11 at 06:46
  • 1
    @Neifen: mutation of a String creates a new String. That's why it's inefficient. On the other hand, mutation of a StringBuilder does not create a new StringBuilder. – user183037 Sep 22 '11 at 06:47
2

If you frequently modify the string, go with StringBuilder. Otherwise, if it's immutable anyway, go with String.

To answer your question on how to replace characters, check this out: http://download.oracle.com/javase/tutorial/java/data/buffers.html. The StringBuilder operations described there are what you want.

Here's another good write-up on StringBuilder: http://www.yoda.arachsys.com/csharp/stringbuilder.html

user183037
  • So a replacement like my second example is a mutation, isn't it? But isn't it counterproductive to take `stringBuilder.toString().replace(...);` and save it in a StringBuilder again? – Neifen Sep 22 '11 at 06:31
  • Yes it is. So if you have a lot of replacements like that, best to stick to using `StringBuilder`. If you use `String`, for each mutation, a new `String` object is created - that's inefficient memory management. – user183037 Sep 22 '11 at 06:34
  • so what is the optimal solution? is there a memory-saving possibility to do that? – Neifen Sep 22 '11 at 06:36
  • Or are you saying that you need the object to remain as `String`? Where do you use it? – user183037 Sep 22 '11 at 06:36
  • Build your string with `StringBuilder`. Once you've made all your changes, optionally save it as `String`. It doesn't matter (in most cases) if you retain your object as `StringBuilder`. – user183037 Sep 22 '11 at 06:38
2

What we know, is that the mutation of a String uses lots of memory.

That is incorrect. Strings cannot be mutated. They are immutable.

What you are actually talking about is building a String from other strings. That can use a lot more memory than is necessary, but it depends on how you build the string.

So what we can do is to use a StringBuilder/StringBuffer.

Using a StringBuilder will help in some circumstances:

  String res = "";
  for (String s : ...) {
      res = res + s;
  }

(If the loop iterates many times then optimizing the above to use a StringBuilder could be worthwhile.)
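For comparison, here is a hedged sketch of the loop written both ways. The class name, method names and the `parts` list are made up for the illustration; both methods return the same text, but the second one reuses a single buffer instead of creating a new String on every iteration.

```java
import java.util.Arrays;
import java.util.List;

final class LoopConcat {

    // Creates a new String (and copies all previous content) on every iteration.
    static String concatWithPlus(List<String> parts) {
        String res = "";
        for (String s : parts) {
            res = res + s;
        }
        return res;
    }

    // Appends into one growing buffer and converts to a String once at the end.
    static String concatWithBuilder(List<String> parts) {
        StringBuilder sb = new StringBuilder();
        for (String s : parts) {
            sb.append(s);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("a", "b", "c");
        System.out.println(concatWithPlus(parts));
        System.out.println(concatWithBuilder(parts));
    }
}
```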

But in other circumstances it is a waste of time:

  String res = s1 + s2 + s3 + s4 + s5;

(It is a waste of time to optimize the above to use a StringBuilder because the Java compiler will automatically translate the expression into code that creates and uses a StringBuilder, or into an equivalent optimized concatenation on newer Java versions.)

You should only ever use a StringBuffer instead of a StringBuilder when the string needs to be accessed and/or updated by more than one thread; i.e. when it needs to be thread-safe.

But at what point should we change to StringBuilder?

The simple answer is to only do it when the profiler tells you that you have a performance problem in your string handling / processing.

Generally speaking, StringBuilders are used for building strings rather than as the primary representation of the strings.

And what should we do, when we have to split it or to replace characters in there?

Then you have to review your decision to use a StringBuilder / StringBuffer as your primary representation at that point. And if it is still warranted you have to figure out how to do the operation using the API you have chosen. (This may entail converting to a String, performing the operation and then creating a new StringBuilder from the result.)

Stephen C
1

If you need to perform a lot of modifications on your String, then you can go for StringBuilder. Go for StringBuffer if you are in a multithreaded application.

Vaandu
  • 2
    StringBuffer is designed to be thread-safe and all public methods in StringBuffer are synchronized. So it can be used in multithreaded app. But use the new `StringBuilder` wherever possible. – Vaandu Sep 22 '11 at 06:34
0

Both a String and a StringBuilder use about the same amount of memory. Why do you think it is “much”?

If you have measured (for example with jmap -histo:live) that the classes [C and java.lang.String take up most of the memory in the heap, only then should you think further in this direction.

Maybe there are multiple strings with the same value. Then, since Strings are immutable, you could intern the duplicate strings. Don't use String.intern for this, since it has bad performance characteristics; use Google Guava's Interner instead.
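To show what interning does without pulling in a library, here is a minimal hand-rolled stand-in. It is not Guava's Interner: the SimpleInterner class is hypothetical, is not thread-safe, and keeps strong references to every string it has seen, so its pool only ever grows.

```java
import java.util.HashMap;
import java.util.Map;

final class SimpleInterner {

    private final Map<String, String> pool = new HashMap<>();

    // Returns the instance stored for an equal string seen earlier, or s itself if it is new.
    String intern(String s) {
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    public static void main(String[] args) {
        SimpleInterner interner = new SimpleInterner();
        String a = new String("duplicate");
        String b = new String("duplicate");
        System.out.println(a == b);                                   // two distinct objects
        System.out.println(interner.intern(a) == interner.intern(b)); // one shared object
    }
}
```

After interning, duplicates can be garbage collected as long as the rest of the application only holds on to the returned instance.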

Roland Illig
  • `String` does use less memory than `StringBuilder`. – user183037 Sep 22 '11 at 06:44
  • So? I did only say that they use *about* the same amount of memory. I don't have my JVM available right now to check whether a String really needs 8 bytes less than a StringBuilder, but that doesn't seem to be the point here. If the OP is concerned that String already uses too much memory, he should not switch to StringBuilder but look for alternatives. – Roland Illig Sep 22 '11 at 07:00
  • You made an approximate statement and went on to elaborate about something that was barely remotely connected to what the OP was asking. It's obvious from the OP's question that he's new to Java. You're just throwing him off when you talk about interning and Guava's interner when it's clearly not what he's asking about. I wasn't being rude, merely stating a fact. No need to throw a fit. – user183037 Sep 22 '11 at 07:06
  • Ok, now I understood it. I answered the question differently because he is concerned about the memory usage of strings. And interning them helped me a lot once, and it seemed reasonably simple to me. – Roland Illig Sep 22 '11 at 07:15
  • 1
    Thanks, it takes a lot to see both sides of an argument rationally - not a lot of people can do that :) Interning may have come easy to you, but it's definitely not something one wants to use without full knowledge. It's very easy to do it wrong and end up running out of memory because of that. – user183037 Sep 22 '11 at 07:26