
I was creating a demo to show how much faster StringBuilder is than String concatenation. I also wanted to show that String concatenation is equivalent to making a new StringBuilder for each append, using the following 2 blocks:

Simply concatenate a String

String result = "";
for (int i = 0; i < 200000; i++) {
    result += String.valueOf(i);
}

Create a new StringBuilder for each concatenation

String result = "";
for (int i = 0; i < 200000; i++) {
    result = new StringBuilder(result).append(String.valueOf(i)).toString();
}

My understanding was that the String concatenation example compiled to the same thing as creating a new StringBuilder for each concatenation. I even decompiled the code to see the same result for both (I know decompiled code is not necessarily an exact representation of the written or compiled code).

Obviously, using a single StringBuilder for all appending ran in a fraction of the time of these 2 examples. But the weird thing is that the 2nd example (creating a new StringBuilder manually each time) ran in nearly half the time of the simple concatenation example. Why is creating a new StringBuilder each time so much faster than concatenation?

(Using Java Adoptium Temurin 17 in case it matters)

Update: for those asking for the full code:

import java.util.Date;

public class App {
    public static void main(String[] args) {

        String result = "";
        Date start = new Date();
        for (int i = 0; i < 200000; i++) {
            result = new StringBuilder(result).append(String.valueOf(i)).toString();
            //result += String.valueOf(i);
        }
        Date end = new Date();
        System.out.println(end.getTime() - start.getTime());
    }
}

Like I said, it's a simple example. You can obviously just comment out the StringBuilder line and uncomment the concat line to test the other example.

As I mentioned below, I know Date is not meant to be used for metrics like this but it's a relatively simple example, and even without it, the difference is significantly noticeable in person.

I've tried this on 3 different Java Runtimes (Oracle 8, AdoptOpenJDK 11, and Adoptium Temurin 17), and while each one takes different amounts of time, the difference between the 2 is still pretty similar.

cbender
  • In Java 17, this is unlikely to be the case; Java 17 uses something _much_ smarter than "compiling to the same as creating a new StringBuilder." See [JEP 280](https://openjdk.java.net/jeps/280) for details. That said, it seems likely that your measurement technique is problematic. – Louis Wasserman Nov 18 '21 at 16:41
  • @LouisWasserman I double checked, maven compile target and source are 17 as well as the runtime. Just using old-fashioned `new Date()` for start and end and subtracting the difference. But even if that wasn't accurate, the difference is noticeable (it's printing when each one is done). – cbender Nov 18 '21 at 16:45
  • It is _entirely possible_ for that measurement technique to actively lie to you. – Louis Wasserman Nov 18 '21 at 16:46
  • There have been _many_ questions of this type. The general rule still stands: loop concatenation should use `StringBuilder`. And use `JMH` for measuring. – Eugene Nov 18 '21 at 16:46
  • @LouisWasserman I know that's not the best super accurate way but it's literally almost double the time. Even if I stand here and manually time it, I can easily note the time difference. It's taking way longer to do the concatenation. – cbender Nov 18 '21 at 16:50
  • @Eugene obviously `StringBuilder` should be used. The point of this question isn't why `StringBuilder` is faster than concatenation, it's why String concatenation is so much slower than creating a **new `StringBuilder` for each append**. – cbender Nov 18 '21 at 16:51
  • I have a sample project where I sometimes run JMH benchmarks on GitHub. [Here are the results](https://github.com/wind57/0x-github-jmh-runner/runs/4254472271?check_suite_focus=true#step:4:113) for [your exact sample](https://github.com/wind57/0x-github-jmh-runner/blob/686c497472df9fa3991a232497b2bbd6b199e414/src/jmh/java/zero/x/so/ConcatSample.java). You can increase the `@Fork` count and warmups to see more accurate results, but the point stands: "it's how you measure". – Eugene Nov 18 '21 at 17:20
  • @Eugene thanks, similar enough time between the 2, not nearly as drastic as mine, although the avg of the concat is still longer. But based on what everyone is saying, it should be faster if using Java 9+. I tried 8, 11, & 17 and the % time difference was pretty similar across so no idea what's going on. I asked the question because obviously this behavior isn't expected... – cbender Nov 18 '21 at 17:37
  • As said: change the `@Fork`, increase measurements and warmups, and you will get much closer numbers. And those results do not show anything "drastic", do they? Yet again, and for the last time: it's how you measure. – Eugene Nov 18 '21 at 17:48

1 Answer


You haven't shown us much about your measurement technique, but it's almost certain that's the culprit; correct benchmarking in Java is very hard. Most notably, Java code is always slow at startup, and that startup cost scales with the complexity of the code, even if the steady-state performance is very different.
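To make the startup point concrete, here is a minimal sketch (not the asker's program, and not a substitute for JMH) that runs both loops from the question several times in the same JVM and times each round with `System.nanoTime()`. The class name `ConcatTiming`, the helper `timeMillis`, the reduced iteration count `N`, and the number of rounds are all assumptions chosen for illustration. If the gap between the two variants shrinks in later rounds, the first-round difference was probably startup cost rather than steady-state behavior.

```java
import java.util.function.IntFunction;

public class ConcatTiming {
    // Hypothetical iteration count, much smaller than the question's 200000
    // so that several rounds finish quickly.
    static final int N = 20_000;

    // Plain string concatenation, as in the question's first block.
    static String concatLoop(int n) {
        String result = "";
        for (int i = 0; i < n; i++) {
            result += String.valueOf(i);
        }
        return result;
    }

    // A new StringBuilder for every append, as in the question's second block.
    static String builderPerAppendLoop(int n) {
        String result = "";
        for (int i = 0; i < n; i++) {
            result = new StringBuilder(result).append(String.valueOf(i)).toString();
        }
        return result;
    }

    // Times one run of the given loop in milliseconds using the monotonic clock.
    static long timeMillis(IntFunction<String> loop, int n) {
        long start = System.nanoTime();
        String s = loop.apply(n);
        long elapsed = System.nanoTime() - start;
        // Use the result so the work cannot be discarded as dead code.
        if (n > 0 && s.isEmpty()) {
            throw new IllegalStateException("unexpected empty result");
        }
        return elapsed / 1_000_000;
    }

    public static void main(String[] args) {
        // Early rounds include class loading and JIT warmup; later rounds are
        // closer to steady state. The round count here is arbitrary.
        for (int round = 0; round < 5; round++) {
            long concat = timeMillis(ConcatTiming::concatLoop, N);
            long builder = timeMillis(ConcatTiming::builderPerAppendLoop, N);
            System.out.println("round " + round + ": concat=" + concat
                    + " ms, builder-per-append=" + builder + " ms");
        }
    }
}
```

Even this is only a rough check: a hand-rolled loop like this cannot control for garbage collection, JIT compilation order, or run-to-run variance the way a dedicated harness can.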

In any event, it's not true that string concatenation compiles to the same thing as your StringBuilder code -- certainly not anymore. JEP 280 in Java 9 changed that compilation strategy to something much more effective for long-running applications, but that notably incurs some slowdown at startup, which your benchmark may not have accounted for.
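As a rough illustration of what JEP 280 changed: since Java 9, `javac` by default emits an `invokedynamic` instruction for `+` on strings, and that call site is bootstrapped by `java.lang.invoke.StringConcatFactory`. The sketch below calls the factory by hand to show the one-time linking step that happens the first time each concatenation site runs. The class and method names (`IndyConcatSketch`, `concatViaFactory`) are made up for this example, and ordinary code never needs to call the factory directly.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

public class IndyConcatSketch {
    // Manually performs the linking that the JVM does once per concatenation
    // site, then invokes the resulting handle. Requires Java 9 or later.
    static String concatViaFactory(String a, String b) {
        try {
            // Shape of the concatenation: (String, String) -> String.
            MethodType type = MethodType.methodType(String.class, String.class, String.class);
            // The name argument ("concat" here) is arbitrary for this bootstrap.
            CallSite site = StringConcatFactory.makeConcat(MethodHandles.lookup(), "concat", type);
            MethodHandle handle = site.getTarget();
            return (String) handle.invokeExact(a, b);
        } catch (Throwable t) {
            throw new IllegalStateException("linking or invoking the concat site failed", t);
        }
    }

    public static void main(String[] args) {
        System.out.println(concatViaFactory("12", "3")); // prints 123
    }
}
```

The linking step is paid once per call site, which is one plausible source of the extra startup cost mentioned above; how the runtime implements the concatenation after linking is an internal detail that varies between JDK versions.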

Louis Wasserman
  • Just using old-fashioned `new Date()` for start and end and subtracting the difference. But even if that wasn't accurate, the difference is noticeable (it's printing when each one is done). – cbender Nov 18 '21 at 16:47
  • @cbender: A [mcve] would be more useful - but I agree with Louis that it's almost certain that the problem is in the test construction rather than in the behavior of Java itself. The benefit of seeing the exact code you're running is that we'd be more likely to be able to work out why you're seeing the results you are. – Jon Skeet Nov 18 '21 at 16:49
  • @JonSkeet that is the exact code I'm running. This isn't part of any existing application code, was just creating a simple example. I'm also just using `new Date()` before and after to compare. But even though `new Date()` isn't a great method for measurement, the difference between the 2 is physically noticeable, it's that much longer. – cbender Nov 18 '21 at 17:03
  • What Jon means is you should show us your entire program, not just these loops. – Louis Wasserman Nov 18 '21 at 17:25
  • @LouisWasserman @JonSkeet I updated the question. Like I said, it's very simple. And yes, I know not to use `Date` for things like this usually, but this is a simple example and even without it, the difference is physically noticeable in person, especially when using larger numbers. Also listed all of the Java runtimes I tried. – cbender Nov 18 '21 at 18:02
  • It’s not like I’m using a faulty metric and it’s a few seconds off. When raised to 400000 the builder took a minute and the concat took a full 2-3 minutes (AdoptOpenJDK 11). Don’t need exact precision to notice that difference. – cbender Nov 18 '21 at 18:22