2

I know that using the "+" concatenation operator for building strings is very inefficient, and that is why it is recommended to use the StringBuilder class, but I was wondering if this kind of pattern is inefficient too?

String some = a + "\t" + b + "\t" + c + "\t" + d + "\t" + e;

I guess here the compiler will optimize the assignment fine, or not?

MetallicPriest
  • 5
    That's fine. The compiler can optimise it. See also [How is String concatenation implemented in Java 9?](https://stackoverflow.com/questions/46512888/how-is-string-concatenation-implemented-in-java-9) – khelwood Feb 11 '19 at 15:02
  • 1
    I would be cautious saying that @khelwood, we don't know that `a`/`b`/etc are all constants (which is where it would optimize to a literal). Otherwise the only optimization we might see here is the compiler literally swapping the concatenation with `StringBuilder` all on its own. In a loop, that could be a lot of unnecessary stringbuilders that are resolved. – Rogue Feb 11 '19 at 15:04
  • @Rogue I don't understand what problem you are describing. If the question is "Can I write a one-line string declaration instead of explicitly using a StringBuilder, and rely on the compiler to figure out how to perform it?" and you're saying the compiler *might* have to fall back on using a StringBuilder, then the code is fine. What is the alternative? – khelwood Feb 11 '19 at 15:07
  • @khelwood in the case of a loop, constructing the `StringBuilder` outside of the loop and appending to it without creating a string from it every iteration. It's moreso a question of scope at that point, but I think there's a duplicate here for an SO question about concat optimizations in java. Still haven't found it just yet though – Rogue Feb 11 '19 at 15:09
  • @Rogue In **other situations** an explicit StringBuilder is faster. But the code in the question isn't appending to a string in a loop. It's catting a bunch of stuff together in one line. – khelwood Feb 11 '19 at 15:10
  • Sure with just that singular line alone, it's just a case to consider (especially if anyone stumbled on this page from google). E.g. if that line _were_ in a loop. – Rogue Feb 11 '19 at 15:12
  • 1
    "optimize to a literal" wasn't in the problem statement. The compiler will optimize string concatenation to `StringBuilder.append` calls within a single expression involving non-constant terms. As pointed out, it won't with concatenations on different lines, although the JIT/HotSpot runtime compiler might do something. – Lew Bloch Feb 11 '19 at 20:43
  • It depends if a,b,c,d are constants.... the compiler will optimize it if it can resolve the values (at compile time) – mnesarco Feb 11 '19 at 15:04
  • Why they have to be constants? Why there would be a problem with normal variables? StringBuilder would work fine with variables too, right. Some other optimization may not work, but what kind of optimization are we talking about then? – MetallicPriest Feb 11 '19 at 15:06
  • The optimization he's thinking of is where the compiler will replace a concatenation between constants with just the literal value. There are other optimizations as well. – Rogue Feb 11 '19 at 15:07
  • Yes, StringBuilder is a kind of optimization . But constant expressions can be fully optimized to a literal value. So it depends of what kind of optimization are you thinking of. – mnesarco Feb 11 '19 at 15:11
  • Can you explain that further? Explanations help others to learn from your answer – Nico Haase Feb 12 '19 at 11:44

3 Answers

6

Your premise, “that using the "+" concatenation operator for building strings is very inefficient”, is not correct. First, string concatenation itself is not a cheap operation, as it implies creating a new string containing all concatenated strings and therefore copying their character contents. But this always applies, regardless of how you do it.

When you use the + operator, you’re telling what you want to do, without saying how to do it. Not even the Java Language Specification demands a particular implementation strategy, except that the concatenation of compile-time constants must be done at compile time. So for compile-time constants, the + operator is the most efficient solution¹.
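As a minimal sketch of the compile-time constant rule (the class and method names here are made up for illustration), the following compares a constant expression with one that depends on a run-time value. It relies on the language rule that constant string expressions are interned like literals, while a non-constant concatenation yields a newly created string.

```java
// Illustrative sketch; ConstantFoldingDemo and its methods are hypothetical names.
final class ConstantFoldingDemo {
    static final String A = "a";
    static final String B = "bb";

    // Constant expression: the compiler folds it into the single literal "a\tbb".
    static String folded() {
        return A + "\t" + B;
    }

    // Not a constant expression: the parameter is only known at run time.
    static String notFolded(String a) {
        return a + "\t" + B;
    }

    public static void main(String[] args) {
        // Folded constants are interned, so even the identity comparison holds.
        System.out.println(folded() == "a\tbb");        // true
        // Run-time concatenation creates a new object with equal content.
        String s = notFolded("a");
        System.out.println(s.equals("a\tbb"));          // true
        System.out.println(s == "a\tbb");               // false
    }
}
```

The identity comparison is used only to make the folding visible; ordinary code should compare strings with equals.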

In practice, all commonly used compilers from Java 5 to Java 8 generate code using a StringBuilder under the hood (before Java 5, they used StringBuffer). This applies to statements like yours, so replacing it with a manual StringBuilder use would not gain much. You could be slightly better than the typical compiler generated code by providing a reasonable initial capacity, but that’s all.

Starting with Java 9, compilers generate an invokedynamic instruction which allows the runtime to provide the actual code performing the concatenation. This could be a StringBuilder code similar to the one used in the past, but also something entirely different. Most notably, the runtime provided code can access implementation specific features, which the application code could not. So now, the string concatenation via + can be even faster than StringBuilder based code.
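To make the invokedynamic mechanism a little more concrete, here is a hedged sketch that calls the bootstrap method java.lang.invoke.StringConcatFactory.makeConcatWithConstants by hand, which is roughly what the JVM does when it links such a call site. It requires Java 9 or later. The class name, the method name "concat" and the two-argument shape are assumptions chosen for illustration; in the recipe string, \u0001 marks the position of a dynamic argument.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

// Illustrative sketch; IndyConcatDemo is a hypothetical name.
final class IndyConcatDemo {
    // Concatenates a + "\t" + b through a manually bootstrapped call site.
    static String concat(String a, String b) {
        try {
            CallSite site = StringConcatFactory.makeConcatWithConstants(
                    MethodHandles.lookup(),
                    "concat", // arbitrary name, not interpreted by the factory
                    MethodType.methodType(String.class, String.class, String.class),
                    "\u0001\t\u0001"); // recipe: argument, tab, argument
            MethodHandle target = site.getTarget();
            return (String) target.invokeExact(a, b);
        } catch (Throwable t) {
            throw new IllegalStateException("concatenation bootstrap failed", t);
        }
    }

    public static void main(String[] args) {
        System.out.println(concat("a", "bb").equals("a\tbb")); // true
    }
}
```

Application code would not normally do this; the point is only that the strategy behind the returned handle is chosen by the runtime, not fixed in the class file.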

Since this applies to a single concatenation expression only, when performing a string construction using multiple statements or even a loop, using a StringBuilder consistently during the entire construction may be faster than the multiple concatenation operations. However, since the code runs in an optimizing environment, with a JVM recognizing some of these patterns, not even that can be said for sure.
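The multi-statement case might look like the following sketch (class and method names are hypothetical). Both methods produce the same result; the first performs a separate concatenation per iteration, while the second keeps one StringBuilder for the whole construction. Which one is faster in a given program is something to measure rather than assume.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch; LoopConcatDemo is a hypothetical name.
final class LoopConcatDemo {
    // One concatenation per iteration: each pass creates a new String
    // that contains a copy of everything accumulated so far.
    static String joinWithPlus(List<String> parts) {
        String result = "";
        for (String part : parts) {
            result += part + "\t";
        }
        return result;
    }

    // A single StringBuilder used consistently for the entire construction.
    static String joinWithBuilder(List<String> parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part).append('\t');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("a", "bb", "ccc");
        System.out.println(joinWithPlus(parts).equals(joinWithBuilder(parts))); // true
    }
}
```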

This is the time to remember the old rule: only try to optimize performance when there is an actual performance problem. And always verify with impartial measuring tools whether an attempted optimization truly improves performance. There are a lot of widespread myths, wrong or outdated, about performance optimization tricks.

¹ except when you have repeated parts and want to reduce the size of the class file

Holger
  • 2
    thank you! I had no time to post an answer since I *really* dislike the accepted one and I also have no idea how saying that at the bytecode level there is a `StringConcatFactory::makeConcatWithConstants` actually answers it. – Eugene Feb 12 '19 at 14:12
5

The compiler optimizes this particular example:

String a = "a";
String b = "bb";
String c = "ccc";
String some = a + "\t" + b + "\t" + c;

On Java 9+, javac compiles this to a single invokedynamic call bootstrapped by makeConcatWithConstants, which makes it efficient. As per the javap -v output:

Code:
  stack=3, locals=5, args_size=1
     0: ldc           #2                  // String a
     2: astore_1
     3: ldc           #3                  // String bb
     5: astore_2
     6: ldc           #4                  // String ccc
     8: astore_3
     9: aload_1
    10: aload_2
    11: aload_3
    12: invokedynamic #5,  0              // InvokeDynamic #0:makeConcatWithConstants:(Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;)Ljava/lang/String;
    17: astore        4
    19: return

However, if a, b and c are compile-time constants, the compiler will optimize the code further:

final String a = "a";
final String b = "bb";
final String c = "ccc";
String some = a + "\t" + b + "\t" + c;

and the variable some will be loaded with a constant value:

Code:
  stack=1, locals=5, args_size=1
     0: ldc           #2                  // String a
     2: astore_1
     3: ldc           #3                  // String bb
     5: astore_2
     6: ldc           #4                  // String ccc
     8: astore_3
     9: ldc           #5                  // String a\tbb\tccc
    11: astore        4
    13: return

In other circumstances, e.g. inside a for loop, the compiler might not be able to produce optimized code, so StringBuilder might be faster.

Karol Dowbecki
  • 3
    Wanted to just re-stress the point that this only applies to literal/constant concatenation. If `a`/etc is a variable that was, say, passed in, we lose that optimization (and it'll likely just be replaced with a `StringBuilder` call, from what I've seen in the past). – Rogue Feb 11 '19 at 15:06
  • @KarolDowbecki I have no idea how this answers the question. `StringConcatFactory` has just two methods: `makeConcat` and `makeConcatWithConstants`. Now what those method will *actually* use at runtime is an entire different story. – Eugene Feb 11 '19 at 16:39
  • @Rogue how about you try to pass `a` as an argument to a method and decompile that? It is still going to be `makeConcatWithConstants`... – Eugene Feb 11 '19 at 16:40
2

In the general case, both concatenation with + and using StringBuilder are perfectly correct and work. But in some situations concatenation with + becomes less efficient than using StringBuilder.

String concatenation NOT IN LOOP - EFFICIENT!!!

This performs well, because the compiler (javac up to Java 8) translates the expression into StringBuilder calls.

String some = a + "\t" + b + "\t" + c + "\t" + d + "\t" + e;

This is OK, because the compiler internally changes this code into something like the following:

String some = new StringBuilder().append(a).append('\t').append(b).append('\t')
                                 .append(c).append('\t').append(d).append('\t')
                                 .append(e).toString();

P.S. StringBuilder has an internal buffer. If you know how long the resulting string will be, it is better to reserve the whole buffer at the beginning. E.g. if the final string will be at most 1024 characters, you could use new StringBuilder(1024).
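A minimal sketch of that pre-sizing idea, assuming three non-null fields joined by tabs (the class and method names are hypothetical): the capacity is computed from the input lengths so the buffer never has to grow.

```java
// Illustrative sketch; PresizedBuilderDemo is a hypothetical name.
final class PresizedBuilderDemo {
    // Joins three fields with tabs using a builder sized for the exact result.
    static String joinFields(String a, String b, String c) {
        // Total length: the three fields plus two tab separators.
        int capacity = a.length() + b.length() + c.length() + 2;
        StringBuilder sb = new StringBuilder(capacity);
        sb.append(a).append('\t').append(b).append('\t').append(c);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinFields("a", "bb", "ccc").equals("a\tbb\tccc")); // true
    }
}
```

For short strings like these the gain is likely negligible; pre-sizing matters mostly when the result is large and its size is known in advance.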

String concatenation IN LOOP - NOT EFFICIENT!!!

This performs badly, because the compiler cannot wrap the whole loop in a single StringBuilder, as you would by hand like this:

StringBuilder buf = new StringBuilder();

for (int i = 0; i < 10; i++)
    buf.append(a).append('\t').append(b).append('\t').append(c).append('\t')
       .append(d).append('\t').append(e).append('\t');

String some = buf.toString();

but the compiler is still able to optimize the concatenations within each loop iteration, like this:

String some = "";

for (int i = 0; i < 10; i++) {
    some = new StringBuilder(some).append(a).append('\t').append(b).append('\t')
                                  .append(c).append('\t').append(d).append('\t')
                                  .append(e).append('\t').toString();
}

As you can see, there are some disadvantages to using string concatenation in a loop.

Oleg Cherednik