28

I know that the javac compiler can translate String concatenation with + into StringBuilder/StringBuffer calls, and I'm curious to know in which version this was introduced.

I'm using this sample code:

public class Main {
    public static void main(String[] args) {
        String a = args[0];
        String s = "a";
        s = s + a;
        s = s + "b";
        s = s + "c";
        s = s + "d";
        s = s + "e";
        System.out.println(s);
    }
}

So far I've tried javac 1.8.0_121, javac 1.6.0_20, javac 1.5.0_22 and javac 1.4.2_19.

Here is a sample of the bytecode I see using javap -c from 1.4.2_19:

6:  astore_2
7:  new #3; //class StringBuffer
10: dup
11: invokespecial   #4; //Method java/lang/StringBuffer."<init>":()V
14: aload_2
15: invokevirtual   #5; //Method java/lang/StringBuffer.append:(Ljava/lang/String;)Ljava/lang/StringBuffer;
18: aload_1
19: invokevirtual   #5; //Method java/lang/StringBuffer.append:(Ljava/lang/String;)Ljava/lang/StringBuffer;
22: invokevirtual   #6; //Method java/lang/StringBuffer.toString:()Ljava/lang/String;
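Read back into source form, the listing above appears to correspond roughly to the second method below. This is only a sketch for illustration: the class and method names are hypothetical and not part of my actual test, and javac 1.5 and later would use StringBuilder where this uses StringBuffer.

```java
// Hypothetical illustration; the names here are assumptions, not from the original test.
final class DesugaredConcat {

    // What the source code says.
    static String withPlus(String s, String a) {
        return s + a;
    }

    // Roughly what the javac 1.4 bytecode does for the line above:
    // create a StringBuffer, append both operands, then call toString().
    static String withStringBuffer(String s, String a) {
        return new StringBuffer().append(s).append(a).toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlus("a", "x"));
        System.out.println(withStringBuffer("a", "x"));
    }
}
```

Both methods return the same string; only the way it is assembled differs.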

All 4 versions seem to use the StringBuilder/StringBuffer optimization, so I'm curious to know in which javac version this change was introduced.

Nicolas C
  • Probably since the beginning... – M A Feb 27 '17 at 14:18
  • Java 5 if I'm not mistaken – Maurice Perry Feb 27 '17 at 14:19
  • More useful would be to know if any of them can do this for a string built with a loop. – Boann Feb 27 '17 at 16:52
  • 1
    Note that it isn't always optimized automatically by the compiler. I brought a Linux Server with Java 8 to its knees last year, by using `String#+` instead of `StringBuilder`. I wanted to build a reasonably large (about 300MB) gnuplot file. There was some logic involved, and I had to build two 150MB strings in parallel, and concatenate them at the end. With `String#+`, it took half an hour and all the memory available. With `StringBuilder`, it took just a few seconds and much less memory. – Eric Duminil Feb 27 '17 at 18:35

4 Answers

32

Here's a quote from the language specification from version 1:

An implementation may choose to perform conversion and concatenation in one step to avoid creating and then discarding an intermediate String object. To increase the performance of repeated string concatenation, a Java compiler may use the StringBuffer class (§20.13) or a similar technique to reduce the number of intermediate String objects that are created by evaluation of an expression.

Back then, there was only StringBuffer; StringBuilder did not exist yet.

Also, a quote from the StringBuffer documentation of JDK 1.0.2:

This Class is a growable buffer for characters. It is mainly used to create Strings. The compiler uses it to implement the "+" operator.

M A
  • 2
    Nice find, but now I have to despise my former teachers for teaching me to use the uselessly verbose syntax of `StringBuffer`/`Builder` ;-( I'll try to find some solace believing that it might not have been optimized in the early version of the OpenJDK, please nobody ruin this :p – Aaron Feb 27 '17 at 15:49
  • 11
    @Aaron StringBuilder should usually be used when constructing complex strings, e.g. in loops (see http://stackoverflow.com/questions/1532461/stringbuilder-vs-string-concatenation-in-tostring-in-java). – M A Feb 27 '17 at 15:52
  • 1
    Oh, thanks for the precision ! While it's not the point of the question, I think it might still be interesting to add in your answer, as OP and other readers might ignore that this optimisation isn't always possible and could conclude as I erroneously did that using string concatenation everywhere is fine. – Aaron Feb 27 '17 at 16:07
  • @Aaron it's not needlessly verbose. Using `StringBuilder` directly gives you control over the initial buffer size. Estimating it could save you some expensive resizing of the underlying char array. – toniedzwiedz Feb 27 '17 at 19:00
  • @toniedzwiedz thanks for that other specific case where using the `StringBuilder` over string concatenation is interesting ! I however remember, maybe incorrectly, being taught to always use the `StringBuilder` over string concatenation, which would not improve performance in the general case and would impact readability and development speed. I'm not even sure teaching to systematically use `StringBuilder` over string concatenation when the string is constructed in a loop would be a good idea, that sounds like premature optimization and counter-productive for loops with few iterations – Aaron Feb 27 '17 at 19:14
  • 1
    @Aaron I wouldn't call it premature optimization. If a loop does, for example, 10000 concatenations it's equally silly to use `+` as a `StringBuilder` with the default buffer of 16 characters. One case where `+` is actually faster is when concatenating static string values. In such cases the resulting string can be built at compile time. – toniedzwiedz Feb 27 '17 at 19:18
  • @toniedzwiedz what I call premature optimization does not depend on how much performance you can gain, but rather on how much you need to gain. If you're developing an overnight batch which has a 2 hours span to execute, whether your terrible loop takes 15 minutes or not might not matter (memory usage might, though). Now I'd hate to make such code, but I understand the concerns of the manager who wants the code ready for yesterday or the next intern to be able to get up to speed on it quickly. – Aaron Feb 27 '17 at 21:40
  • @toniedzwiedz thanks for the new interesting examples BTW, they add to my new conviction that the answer to `StringBuilder` vs string concatenation is "it depends (if it does matter)" – Aaron Feb 27 '17 at 21:48
  • I wonder why Java didn't define static `String.concat` methods with various numbers of operands as well as a `String[]`? Such things would be much more efficient than `StringBuilder` in all cases where the number of strings to be concatenated is known in advance. – supercat Feb 27 '17 at 23:13
  • No need to dive into the JLS or bytecode...just check the Javadoc for StringBuilder! It says Since: 1.5. So this answer is _not_ the best answer. – Bludzee Mar 28 '17 at 09:56
  • @Bludzee I believe the OP's question is about the `+` optimization in general, regardless of whether it is using `StringBuilder` or `StringBuffer`. Obviously the former came in Java 1.5 – M A Mar 28 '17 at 10:46
11

I have looked up the Java Language Specification, First Edition (from 1996). Not an easy find, but here it is. The passage on concatenation optimization was there even then:

An implementation may choose to perform conversion and concatenation in one step to avoid creating and then discarding an intermediate String object. To increase the performance of repeated string concatenation, a Java compiler may use the StringBuffer class (§20.13) or a similar technique to reduce the number of intermediate String objects that are created by evaluation of an expression.

The specification pertained to StringBuffer then, but StringBuilder (which current JLS wording refers to) might be deemed better performing because its methods are not synchronized.

This, however, does not mean that one should rely on the optimization always being in place. For example, String concatenation in a loop is not optimized across iterations: each pass creates a new buffer and copies the accumulated string into it.
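A minimal sketch of the loop case, assuming a compiler that translates each + expression into its own StringBuilder (as javac up to version 8 does). The class and method names are hypothetical, chosen only for this illustration.

```java
// Hypothetical illustration of concatenation in a loop; names are assumptions.
final class LoopConcat {

    // Each iteration is compiled as its own concatenation expression, so the
    // accumulated string is copied again on every pass through the loop.
    static String withPlus(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s = s + i + ",";
        }
        return s;
    }

    // One explicit builder shared across all iterations avoids the repeated copying.
    static String withBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i).append(',');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlus(3));
        System.out.println(withBuilder(3));
    }
}
```

Both methods produce the same result. The difference is the amount of copying, which only becomes noticeable when the loop runs many times or the string grows large.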

lukeg
5

The JLS has already been quoted in other answers. I just want to point out that StringBuffer (https://docs.oracle.com/javase/8/docs/api/java/lang/StringBuffer.html) has been there since 1.0, whereas StringBuilder (https://docs.oracle.com/javase/8/docs/api/java/lang/StringBuilder.html) came in version 1.5. Please see the Since: section of the respective Javadocs.

Shubham Chaurasia
  • No need to dive into the JLS or bytecode...just check the Javadoc for StringBuilder! It says `Since: 1.5`. So this answer is the best answer. (+1) – Bludzee Mar 28 '17 at 09:51
5

This does not answer the question directly, but I want to add that in JDK 9 the StringBuilder::append approach is only one of the permitted strategies, and not the default one.

private enum Strategy {
    /**
     * Bytecode generator, calling into {@link java.lang.StringBuilder}.
     */
    BC_SB,

    /**
     * Bytecode generator, calling into {@link java.lang.StringBuilder};
     * but trying to estimate the required storage.
     */
    BC_SB_SIZED,

    /**
     * Bytecode generator, calling into {@link java.lang.StringBuilder};
     * but computing the required storage exactly.
     */
    BC_SB_SIZED_EXACT,

    /**
     * MethodHandle-based generator, that in the end calls into {@link java.lang.StringBuilder}.
     * This strategy also tries to estimate the required storage.
     */
    MH_SB_SIZED,

    /**
     * MethodHandle-based generator, that in the end calls into {@link java.lang.StringBuilder}.
     * This strategy also estimate the required storage exactly.
     */
    MH_SB_SIZED_EXACT,

    /**
     * MethodHandle-based generator, that constructs its own byte[] array from
     * the arguments. It computes the required storage exactly.
     */
    MH_INLINE_SIZED_EXACT
}

String concatenation is now actually compiled to an invokedynamic instruction, so its implementation is chosen by the JRE at run time rather than fixed by the compiler. The default strategy, by the way, is MH_INLINE_SIZED_EXACT.
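To get a feel for what the invokedynamic call site links to, here is a rough sketch that calls the bootstrap method java.lang.invoke.StringConcatFactory.makeConcatWithConstants by hand. It needs Java 9 or later. The class name, the method name and the choice of two String arguments are assumptions made for this illustration; real compiled code reaches the factory through invokedynamic, not through an explicit call like this.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

// Hypothetical illustration; IndyConcatSketch and concat are made-up names.
final class IndyConcatSketch {

    static String concat(String left, String right) {
        try {
            // Shape of the concatenation: (String, String) -> String.
            MethodType type = MethodType.methodType(String.class, String.class, String.class);
            // In the recipe, each \1 character stands for one dynamic argument.
            CallSite site = StringConcatFactory.makeConcatWithConstants(
                    MethodHandles.lookup(), "makeConcatWithConstants", type, "\1\1");
            return (String) site.dynamicInvoker().invokeExact(left, right);
        } catch (Throwable t) {
            // Wrapped only to keep this sketch free of checked exceptions.
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(concat("a", "b"));
    }
}
```

Which strategy the factory uses behind that call site is up to the runtime, which is the point made above.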

Eugene
  • Sorry but could you explain what *permitted strategies* means? Is it some runtime optimization dedicated to string concatenation? – glee8e Mar 03 '17 at 12:58
  • @glee8e I think I've covered the topic a bit here: http://stackoverflow.com/questions/40267601/how-much-does-java-optimize-string-concatenation-with/42138460#42138460 – Eugene Mar 03 '17 at 13:11