6

Have you ever thought about the implications of this change in the Java Programming Language?

The String class was conceived as an immutable class (and this was a deliberate decision). But String concatenation is really slow; I've benchmarked it myself. So StringBuffer was born: a really great class, synchronized and really fast. But some people were not happy with the performance cost of its synchronized blocks, and StringBuilder was introduced.

But, when using String to concatenate not too many objects, the immutability of the class makes it a really natural way to achieve thread-safety. I can understand the use of StringBuffer when we want to manage several Strings. But, here is my first question:

  1. If you have, say, 10 or fewer strings that you want to append, for example, would you trade simplicity for just some milliseconds in execution time?

    I've benchmarked StringBuilder too. It is more efficient than StringBuffer (just a 10% improvement). But if you're using StringBuilder in your single-threaded program, what happens if you later want to change the design to use several threads? You have to change every instance of StringBuilder, and if you forget one, you may get some weird effect caused by the race condition that can arise.

  2. In this situation, would you trade performance for hours of debugging?

Ok, that's all. Beyond the simple question (StringBuffer is more efficient than "+" and thread-safe, and StringBuilder is faster than StringBuffer but not thread-safe), I would like to know when to use them.

(Important: I know the differences between them; this is a question related to the architecture of the platform and some design decisions.)

santiagobasulto
  • I’m still not entirely sure what your question is, despite the bolded parts. In fact, your text seems to contain the answer to the question asked in the title, and the bolded parts directly contradict each other (i.e. the bold questions contradict the “important” notice at the beginning) and it seems like they have already been answered countless times on Stack Overflow. – Konrad Rudolph Mar 26 '11 at 15:23
  • Konrad, I'd like to know the opinion of experienced coders. I came to this thought today and want to know if it makes any sense. Maybe I'm not asking it correctly (my English is not very good). Sorry for that. You're right, the notice at the top is not good; I have to change it. I would like to get answers for "when to use them", but not based on efficiency (like the other questions I've seen here). – santiagobasulto Mar 26 '11 at 15:28
  • *I've benchmarked StringBuilder too. It is more efficient than StringBuffer (just a 10% avg)* Show the benchmark. Are you sure you have written it properly? Up to the moment, I think I have not seen a proper microbenchmark on this site. – bestsss Mar 26 '11 at 15:30
  • It's really simple. Try to append 100,000 String objects one at a time (with a for loop, for example) and you'll see which takes more time. I've done this 100 times and taken a simple average. – santiagobasulto Mar 26 '11 at 15:32
  • @santiagobasulto, yeah almost certainly a bad microbenchmark. – bestsss Mar 26 '11 at 15:56
  • @bestsss Why? I would like to know how to do it well. – santiagobasulto Mar 26 '11 at 16:06
  • @santiagobasulto, if you wish, add the test code to the question and I will reply in detail (in a few hours, gotta be off for now) – bestsss Mar 26 '11 at 16:26

5 Answers

9

Just a comment about your "StringBuilders and threads" remark: even in multi-threaded programs, it's very rare to want to build up a string across multiple threads. Typically, each thread will have some set of data and create a string from that, often by concatenating multiple strings together. They'll then convert that StringBuilder to a string, and that string can be safely shared among threads.

I don't think I've ever seen a bug due to a StringBuilder being shared between threads.

Personally I wish StringBuffer didn't exist - it was in the "let's synchronize everything" phase of Java, leading to Vector and Hashtable which have been almost obsoleted by the unsynchronized ArrayList and HashMap classes from Java 2. It just took a little while longer for the unsynchronized equivalent of StringBuffer to arrive.

So basically:

  • Use string when you don't want to perform manipulations, and want to be sure nothing else will
  • Use StringBuilder to perform manipulation, usually over a short period
  • Avoid StringBuffer unless you really, really need it - and as I say, I can't remember ever seeing a situation where I'd use StringBuffer instead of StringBuilder, when both are available.
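The pattern described above, where each thread builds its own string and only the finished String is shared, might look like the following minimal sketch. The class and method names (`PerThreadBuilderDemo`, `describe`, `runWorkers`) are hypothetical and only serve the illustration; the point is that no StringBuilder ever leaves the method that created it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PerThreadBuilderDemo {

    // The StringBuilder is a local variable: no other thread can ever see it.
    static String describe(int workerId, int items) {
        StringBuilder sb = new StringBuilder();
        sb.append("worker ").append(workerId).append(':');
        for (int i = 0; i < items; i++) {
            sb.append(' ').append(i);
        }
        return sb.toString(); // the immutable String is what gets shared
    }

    // Runs one describe() call per worker on a thread pool and collects the Strings.
    static List<String> runWorkers(int workers, int items) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                final int id = w;
                futures.add(pool.submit(() -> describe(id, items)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(3, 3));
    }
}
```

Because the builders are confined to one thread each, switching such code from one thread to many does not require replacing StringBuilder with StringBuffer.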
Jon Skeet
  • Great, Jon! Thanks! That was what I was looking for. I've been coding an application that has this little problem (several threads sharing a simple String), but it is ONE weird and uncommon example. Great comparison with the collections! I've never thought about it. You made a point. – santiagobasulto Mar 26 '11 at 15:37
  • the biggest complaint i have with StringBuffer vs. StringBuilder is they are not related thru some interface, or abstract superclass. – MeBigFatGuy Mar 26 '11 at 15:44
  • @MeBigFatGuy, use Appendable and CharSequence? – bestsss Mar 26 '11 at 15:55
  • @santiagobasulto: Sharing a *string* is fine - but sharing a `StringBuilder/StringBuffer` is the uncommon situation. – Jon Skeet Mar 26 '11 at 16:01
  • @Jon, yes jon, that was what i meant. Several threads get the same instance of a StringBuffer and work with it. – santiagobasulto Mar 26 '11 at 16:07
  • @santiagobasulto: And the work is fine to happen with only a single operation synchronized at a time? That's the problem with self-synchronizing classes - they can only synchronize a single operation, which isn't usually what you want. – Jon Skeet Mar 26 '11 at 16:10
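The last comment can be illustrated with a small sketch. A logical record here consists of two appends; StringBuffer locks each `append()` on its own, so another thread could slip in between them unless the caller adds a lock around the pair. The names (`CompoundAppendDemo`, `appendRecord`, `fill`) are hypothetical, and locking on the buffer itself is just one possible choice of lock.

```java
class CompoundAppendDemo {

    // The two append chains must stay together. The per-method locking of
    // StringBuffer does not guarantee that, so the pair gets its own lock.
    static void appendRecord(StringBuffer shared, String key, String value) {
        synchronized (shared) {
            shared.append(key).append('=');
            shared.append(value).append(';');
        }
    }

    // Several threads write records into one shared buffer.
    static String fill(int threads, int recordsPerThread) {
        final StringBuffer shared = new StringBuffer();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < recordsPerThread; i++) {
                    appendRecord(shared, "k", "v");
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared.toString();
    }

    public static void main(String[] args) {
        System.out.println(fill(2, 3)); // k=v;k=v;k=v;k=v;k=v;k=v;
    }
}
```

Once the caller has to hold a lock anyway, the internal synchronization of StringBuffer adds nothing, and a StringBuilder guarded by the same lock would work just as well.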
8

StringBuffer was in Java 1.0; it was not any kind of a reaction to slowness or immutability. It's also not in any way faster or better than string concatenation; in fact, the Java compiler compiles

String s1 = s2 + s3;

into something like

String s1 = new StringBuilder(s2).append(s3).toString();

If you don't believe me, try it yourself with a disassembler (javap -c, for example.)

The thing about "StringBuffer is faster than concatenation" refers to repeated concatenation. In that case, explicitly creating your own StringBuffer and using it repeatedly performs better than letting the compiler create many of them.

StringBuilder was introduced in Java 5 for performance reasons, as you say. The reason it makes sense is that StringBuffer/Builder are virtually never shared outside of the method that creates them: 99% of their usage is something like the above, where they're created, used to append a few strings together, then discarded.
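The difference between a single concatenation and repeated concatenation can be sketched as follows. The class and method names are hypothetical; both methods produce the same result, but the first creates a new String on every iteration, while the second appends into one builder.

```java
import java.util.Arrays;
import java.util.List;

class RepeatedConcatDemo {

    // Each += produces a new String that contains everything accumulated so far.
    static String joinWithPlus(List<String> parts) {
        String result = "";
        for (String p : parts) {
            result += p;
        }
        return result;
    }

    // One explicitly created builder is reused for every append in the loop.
    static String joinWithBuilder(List<String> parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            sb.append(p);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("a", "b", "c");
        System.out.println(joinWithPlus(parts));    // abc
        System.out.println(joinWithBuilder(parts)); // abc
    }
}
```

For a handful of strings in a single expression the two forms are equivalent in practice; the explicit builder only pays off when the concatenation happens in a loop.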

Ernest Friedman-Hill
5

Nowadays both StringBuffer and StringBuilder are sort of useless (from a performance point of view). Let me explain why:

StringBuilder was supposed to be faster than StringBuffer but any sane JVM can optimize away the synchronization. So it was quite a huge miss (and small hit) when it was introduced.

StringBuffer used NOT to copy the char[] when creating the String (in the non-shared variant); however, that was a major source of issues, including leaking huge char[] arrays for small Strings. In 1.5 they decided that a copy of the char[] must occur every time, and that practically made StringBuffer useless (the sync was there to ensure no thread games could trick the String). That conserves memory, though, and ultimately helps the GC (besides the obviously reduced footprint); char[] is usually among the top three objects consuming memory.

String.concat was and still is the fastest way to concatenate 2 strings (and 2 only... or possibly 3). Keep that in mind, it does not perform an extra copy of the char[].

Back to the useless part: now any third-party code can achieve the same performance as StringBuilder. Even in Java 1.1 I used to have a class named AsycnStringBuffer which did exactly what StringBuilder does now, except that it allocated a larger char[] than StringBuilder. Both StringBuffer and StringBuilder are optimized for small Strings by default, as you can see in the constructor:

  StringBuilder(String str) {
    super(str.length() + 16);
    append(str);
  }

Thus, if the second string is longer than 16 chars, the underlying char[] has to be reallocated and copied once more. Pretty uncool.
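The 16-character headroom can be observed through the public `capacity()` method. The sketch below uses hypothetical helper names; it relies only on the documented initial capacity (length of the argument plus 16) and on the fact that the buffer must grow once the content no longer fits.

```java
class BuilderCapacityDemo {

    // Documented behaviour: the initial capacity is the argument's length plus 16.
    static int initialCapacity(String seed) {
        return new StringBuilder(seed).capacity();
    }

    // Reports whether appending 'extra' forced the builder to allocate a larger buffer.
    static boolean growsOnAppend(String seed, String extra) {
        StringBuilder sb = new StringBuilder(seed);
        int before = sb.capacity();
        sb.append(extra);
        return sb.capacity() != before;
    }

    public static void main(String[] args) {
        System.out.println(initialCapacity("abc"));                    // 19
        System.out.println(growsOnAppend("abc", "0123456789abcdef"));  // false (16 chars fit)
        System.out.println(growsOnAppend("abc", "0123456789abcdefg")); // true (17 chars do not)
    }
}
```

When the final length is known up front, the `StringBuilder(int capacity)` constructor avoids the regrowth entirely.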

That can be a side effect of attempt at fitting both StringBuilder/Buffer and the char[] into the same cache line (on x86) on 32bit OS... but I don't know for sure.

As for the remark about hours of debugging: use your judgment. I personally do not recall ever having any issues with string operations, aside from implementing a rope-like structure for the SQL generator of a JDO implementation.


Edit: Below I illustrate what the Java designers didn't do to make String operations faster. Please note that the class is intended for the java.lang package, and it can be put there only by adding it to the bootstrap classpath. However, even if it is not put there (the difference is a single line of code!), it would still be faster than StringBuilder. Shocking? The class would have made string1+string2+... a lot better than using StringBuilder, but well...

package java.lang;

public class FastConcat {

    public static String concat(String s1, String s2){
        s1=String.valueOf(s1);//null checks
        s2=String.valueOf(s2);

        return s1.concat(s2);
    }

    public static String concat(String s1, String s2, String s3){
        s1=String.valueOf(s1);//null checks
        s2=String.valueOf(s2);
        s3=String.valueOf(s3);
        int len = s1.length()+s2.length()+s3.length();
        char[] c = new char[len];
        int idx=0;
        idx = copy(s1, c, idx);
        idx = copy(s2, c, idx);
        idx = copy(s3, c, idx);
        return newString(c);
    }
    public static String concat(String s1, String s2, String s3, String s4){
        s1=String.valueOf(s1);//null checks
        s2=String.valueOf(s2);
        s3=String.valueOf(s3);
        s4=String.valueOf(s4);

        int len = s1.length()+s2.length()+s3.length()+s4.length();
        char[] c = new char[len];
        int idx=0;
        idx = copy(s1, c, idx);
        idx = copy(s2, c, idx);
        idx = copy(s3, c, idx);
        idx = copy(s4, c, idx);
        return newString(c);

    }
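    private static int copy(String s, char[] c, int idx){
        // two-argument getChars is not public; it is reachable only from within java.lang
        s.getChars(c, idx);
        return idx+s.length();

    }
    private static String newString(char[] c){
        // non-public constructor that takes over the array without copying it
        return new String(0, c.length, c);
        //return String.copyValueOf(c);//if not in java.lang
    }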
}
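For readers who want to try the idea without touching the bootstrap classpath, here is a sketch of the "if not in java.lang" variant using only public String methods. `PublicApiConcat` is a hypothetical name, and this version pays for one extra array copy in the `String(char[])` constructor, which is exactly the single-line difference mentioned above.

```java
class PublicApiConcat {

    static String concat(String s1, String s2, String s3) {
        s1 = String.valueOf(s1); // null becomes "null", as with the + operator
        s2 = String.valueOf(s2);
        s3 = String.valueOf(s3);
        // Allocate the exact final size once, so nothing has to regrow.
        char[] c = new char[s1.length() + s2.length() + s3.length()];
        int idx = 0;
        idx = copy(s1, c, idx);
        idx = copy(s2, c, idx);
        copy(s3, c, idx);
        return new String(c); // public constructor: copies the array once more
    }

    private static int copy(String s, char[] dst, int idx) {
        s.getChars(0, s.length(), dst, idx); // public four-argument overload
        return idx + s.length();
    }

    public static void main(String[] args) {
        System.out.println(concat("xxx: ", "42", " yyy")); // xxx: 42 yyy
        System.out.println(concat("a", null, "c"));        // anullc
    }
}
```

Whether this actually beats StringBuilder depends on the JVM and its version, so it should be measured rather than assumed.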
bestsss
  • A few things: 1. The JVM can (AFAIK) only optimise out synchronisation for local variables (escape analysis) so when using nonlocal variables, `StringBuilder` will still be faster than `StringBuffer`. 2. Using `StringBuilder` *documents* the fact that you’re not caring about synchronisation. When used right, this makes code easier to understand. 3. Your posting makes it sound as if `StringBuilder` is never faster than concatenation. Your `concat` code is all fine but it solves a different problem than a `StringBuilder`. I doubt that it will be faster in a tight loop with a lot of concats. – Konrad Rudolph Mar 28 '11 at 08:57
  • 1) Not really: it depends on how much the JVM inlines, i.e. how far the object can possibly escape. Uncontended sync is also close to free nowadays (it does not inflate the object header). 2) That's not really a performance concern (I myself tend to look at the code, mostly). 3) The posting doesn't say so; the sample code is meant for cases like String s="xxx: "+someInt+" yyy"+anotherString; which would beat any StringBuilder whatsoever. Back to the point: a lot of implementations had dropped StringBuffer by Java 1.5 because it didn't help for shared (more than one) calls. – bestsss Mar 28 '11 at 11:25
  • @Konrad, what is perhaps unclear: as of now, neither StringBuilder nor StringBuffer takes any benefit from being in the java.lang package. Thus, a custom implementation from Java 1.1 performs as quickly as StringBuilder. Before Java 1.5, StringBuffer used to try to play smart and not copy the char[], to avoid the allocation (and some minor GC) costs. That's OK; however, not adding a few simple concat methods (like the snippet above) was not a smart move. String.concat was and is always faster than StringBuilder/Buffer for 2 strings. You can't even come close (unless the JVM can use intrinsics). – bestsss Mar 28 '11 at 11:38
1

I tried the same thing on an XP machine. StringBuilder IS somewhat faster, but if you reverse the order of the runs, or make several runs, you'll notice that the "almost factor two" in the results shrinks to something like a 10% advantage:

StringBuffer build & output duration= 4282,000000 µs
StringBuilder build & output duration= 4226,000000 µs
StringBuffer build & output duration= 4439,000000 µs
StringBuilder build & output duration= 3961,000000 µs
StringBuffer build & output duration= 4801,000000 µs
StringBuilder build & output duration= 4210,000000 µs

For your kind of test the JVM will NOT help out. I had to limit the number of runs and elements just to get ANY result from a "String only" test.
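The two precautions mentioned above, repeating the runs and swapping their order, can be built into the test itself. The following is only a rough sketch with hypothetical names (`AppendTiming`, `measure`); it adds warm-up rounds, alternates which class is timed first, and reports the median instead of a single value. A dedicated harness such as JMH is still the more reliable option for serious measurements.

```java
import java.util.Arrays;

class AppendTiming {

    // Results are stored here so the timed work stays observable.
    static volatile String sink;

    static long timeBuilder(int elements) {
        long start = System.nanoTime();
        StringBuilder sb = new StringBuilder("<root>");
        for (int i = 0; i < elements; i++) {
            sb.append("<data/>");
        }
        sink = sb.append("</root>").toString();
        return System.nanoTime() - start;
    }

    static long timeBuffer(int elements) {
        long start = System.nanoTime();
        StringBuffer sb = new StringBuffer("<root>");
        for (int i = 0; i < elements; i++) {
            sb.append("<data/>");
        }
        sink = sb.append("</root>").toString();
        return System.nanoTime() - start;
    }

    // Returns { median ns for StringBuilder, median ns for StringBuffer }.
    static long[] measure(int warmups, int rounds, int elements) {
        for (int i = 0; i < warmups; i++) { // untimed rounds give the JIT a chance to compile
            timeBuilder(elements);
            timeBuffer(elements);
        }
        long[] builder = new long[rounds];
        long[] buffer = new long[rounds];
        for (int r = 0; r < rounds; r++) {
            if (r % 2 == 0) { // alternate the order so neither side always runs first
                builder[r] = timeBuilder(elements);
                buffer[r] = timeBuffer(elements);
            } else {
                buffer[r] = timeBuffer(elements);
                builder[r] = timeBuilder(elements);
            }
        }
        Arrays.sort(builder);
        Arrays.sort(buffer);
        return new long[] { builder[rounds / 2], buffer[rounds / 2] };
    }

    public static void main(String[] args) {
        long[] m = measure(50, 101, 50000);
        System.out.printf("StringBuilder median = %d ns, StringBuffer median = %d ns%n", m[0], m[1]);
    }
}
```

The warm-up and round counts in `main` are arbitrary illustration values, not recommendations.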

Martin Sjöblom
0

I decided to put the options to the test with a simple XML composition exercise. Testing was done on a 2.7 GHz i5 with 16 GB of DDR3 RAM, for those wishing to replicate the results.

Code:

private int testcount = 1000;
private int elementCount = 50000;
private String output; // receives the built document in each run

public void testStringBuilder() {
    long total = 0;
    int counter = 0;
    while (counter++ < testcount) {
        total += doStringBuilder();
    }
    // integer division: the mean is truncated to whole microseconds
    float f = (total / testcount) / 1000;
    System.out.printf("StringBuilder build & output duration= %f µs%n%n", f);
}

// Builds the XML document once with StringBuilder; returns the elapsed nanoseconds.
private long doStringBuilder() {
    long start = System.nanoTime();
    StringBuilder buffer = new StringBuilder("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
    buffer.append("<root>");
    for (int i = 0; i < elementCount; i++) {
        buffer.append("<data/>");
    }
    buffer.append("</root>");
    //System.out.println(buffer.toString());
    output = buffer.toString();
    long end = System.nanoTime();
    return end - start;
}


public void testStringBuffer() {
    long total = 0;
    int counter = 0;
    while (counter++ < testcount) {
        total += doStringBuffer();
    }
    // integer division: the mean is truncated to whole microseconds
    float f = (total / testcount) / 1000;

    System.out.printf("StringBuffer build & output duration= %f µs%n%n", f);
}

// Same work as doStringBuilder(), but with the synchronized StringBuffer.
private long doStringBuffer() {
    long start = System.nanoTime();
    StringBuffer buffer = new StringBuffer("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
    buffer.append("<root>");
    for (int i = 0; i < elementCount; i++) {
        buffer.append("<data/>");
    }
    buffer.append("</root>");
    //System.out.println(buffer.toString());
    output = buffer.toString();

    long end = System.nanoTime();
    return end - start;
}

Results:

On OSX machine:

StringBuilder build & output duration= 1047.000000 µs 

StringBuffer build & output duration= 1844.000000 µs 


On Win7 machine:
StringBuilder build & output duration= 1869.000000 µs 

StringBuffer build & output duration= 2122.000000 µs

So it looks like the performance enhancement might be platform-specific, depending on how the JVM implements synchronisation.

References:

Use of System.nanoTime() has been covered here -> Is System.nanoTime() completely useless? and here -> How do I time a method's execution in Java?.

Source for StringBuilder & StringBuffer here -> http://www.java2s.com/Open-Source/Java-Document/6.0-JDK-Core/lang/java.lang.htm

Good overview of synchronising here -> http://www.javaworld.com/javaworld/jw-07-1997/jw-07-hood.html?page=1

binarycube
  • Thank you for your answer, but I wasn't looking for a benchmark. My question was regarding the architecture of the Java platform. Anyway, you should really take a look at your test/benchmark, because StringBuilder should be faster in a single-threaded environment (I don't want to be rude, but it's not a good benchmark). Memory would also be something good to test, to see if there are some auxiliary constructs used by StringBuffer or StringBuilder. I recommend you make a single method, because the "concat" method is the same for all classes, i.e. it has the same API. – santiagobasulto Dec 19 '11 at 01:01
  • I agree StringBuilder ***should*** be faster, and a similar test here -> http://littletutorials.com/2008/07/16/stringbuffer-vs-stringbuilder-performance-comparison/ does show that (also done in jre6). The question is why - perhaps an underlying architectural reason ? Further testing on a Windows 7 machine with 1Gb RAM yielded the expected results (ie StringBuilder faster) (previously tests were on an OSX machine)... perhaps different implementations utilising available memory differently, perhaps the host OS plays an important role as well? – binarycube Dec 21 '11 at 00:46
  • No, the reason, my friend, is that your test is not good! If you have a minute, read the introduction of this chapter from Dive into Python (it's Python, not Java, sorry). http://www.diveintopython.net/performance_tuning/timeit.html – santiagobasulto Dec 21 '11 at 02:03
  • Kudos to @santiagobasulto for pointing out single value test is meaningless - should have known better... but it looked like it was working OK ... ahh well... – binarycube Dec 21 '11 at 08:07