I have a benchmark:

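import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;

@BenchmarkMode(Mode.Throughput)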
@Fork(1)
@State(Scope.Thread)
@Warmup(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS, batchSize = 1000)
@Measurement(iterations = 40, time = 1, timeUnit = TimeUnit.SECONDS, batchSize = 1000)
public class StringConcatTest {

    private int aInt;

    @Setup
    public void prepare() {
        aInt = 100;
    }

    @Benchmark
    public String emptyStringInt() {
        return "" + aInt;
    }

    @Benchmark
    public String valueOfInt() {
        return String.valueOf(aInt);
    }

}

And here is the result:

Benchmark                                          Mode  Cnt      Score      Error  Units
StringConcatTest.emptyStringInt                   thrpt   40  66045.741 ± 1306.280  ops/s
StringConcatTest.valueOfInt                       thrpt   40  43947.708 ± 1140.078  ops/s

It shows that concatenating an empty string with an integer is about 50% faster than calling String.valueOf(100) (roughly 66,000 vs. 44,000 ops/s). I understand that "" + 100 is converted to

new StringBuilder().append(100).toString()

and that the -XX:+OptimizeStringConcat optimization is then applied, which makes it fast. What I do not understand is why valueOf itself is slower than concatenation. Can someone explain what exactly is happening and why "" + 100 is faster? What magic does OptimizeStringConcat perform?
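
For reference, the three forms under discussion can be put side by side. This is only a sketch: the class and method names are made up for illustration, and the StringBuilder form reflects what javac emitted for string concatenation at the time of this question (newer JDKs compile concatenation differently). It says nothing about speed; it only shows that all three produce the same string.

```java
public class ConcatForms {

    // The form used in the emptyStringInt benchmark.
    static String viaConcat(int value) {
        return "" + value;
    }

    // The StringBuilder chain the question describes as the compiled form.
    static String viaBuilder(int value) {
        return new StringBuilder().append(value).toString();
    }

    // The form used in the valueOfInt benchmark.
    static String viaValueOf(int value) {
        return String.valueOf(value);
    }

    public static void main(String[] args) {
        int value = 100;
        System.out.println(viaConcat(value).equals(viaBuilder(value)));
        System.out.println(viaConcat(value).equals(viaValueOf(value)));
    }
}
```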

Sotirios Delimanolis
Dmitriy Dumanskiy
  • `"" + 100` is probably much clearer for the compiler to recognize as a constant... – Louis Wasserman Feb 12 '17 at 22:21
  • You could look at the source code for `valueOf`. I'm pretty sure it tries to detect object types, which means autoboxing your int primitive and such – OneCricketeer Feb 12 '17 at 22:22
  • One of those is a method call. The other, the compiler can compile however it wants. – Louis Wasserman Feb 12 '17 at 22:23
  • The JIT could, but it seems unlikely to eliminate the entire thing. – Louis Wasserman Feb 12 '17 at 22:26
  • @LouisWasserman it is not compiled to a constant. It is compiled to a `StringBuilder()` construction. – Dmitriy Dumanskiy Feb 12 '17 at 22:27
  • JIT won't usually kick in on smaller runs very much. What sort of warmup did you give the benchmark? – Lew Bloch Feb 12 '17 at 22:27
  • @LewBloch you can see it in the benchmark. JMH avoids JIT "kicks". @cricket_007 I looked inside `valueOf` but it doesn't answer my question. – Dmitriy Dumanskiy Feb 12 '17 at 22:29
  • @DmitriyDumanskiy, not necessarily; the compiler documentation is pretty vague about it. [Ask me how I know.](http://openjdk.5641.n7.nabble.com/String-concatenation-tweaks-td219080.html) – Louis Wasserman Feb 12 '17 at 22:30
  • @LouisWasserman I can say for sure, as a constant expression would have millions of scores (ops/s) in the benchmark. – Dmitriy Dumanskiy Feb 12 '17 at 22:31
  • I agree that doesn't appear to be the case here. It's not necessarily true in general, and not necessarily true in future compiler versions, because of the discussion I linked to above. – Louis Wasserman Feb 12 '17 at 22:34
  • @LouisWasserman Thanks for the link. Very interesting reading :). – Dmitriy Dumanskiy Feb 12 '17 at 22:52
  • JVM probably does not have an "intrinsic" understanding of `String.valueOf()` - the method isn't used often enough; therefore the method body is executed faithfully every time. – ZhongYu Feb 12 '17 at 22:56
  • @cricket_007 `String.valueOf()` has numerous overloads, which recognize the argument type at compile-time. There is no autoboxing or runtime object type detection. – user207421 Feb 12 '17 at 23:52

1 Answer

As you've mentioned, the HotSpot JVM has the -XX:+OptimizeStringConcat optimization, which recognizes the StringBuilder pattern and replaces it with a highly tuned, hand-written IR graph, while String.valueOf() relies on general compiler optimizations.

I've found the following key differences by analyzing the generated assembly code:

  • Optimized concat does not zero the char[] array created for the result string, while the array created by Integer.toString is cleared after allocation just like any other regular object.
  • Optimized concat translates digits to chars by simply adding the '0' constant, while Integer.getChars uses a table lookup with the associated array bounds check, etc.
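
To make the second point concrete, here is a minimal sketch of the two digit-to-char strategies. It is a simplified illustration and not the actual JDK or HotSpot code: the class, the method names and the DIGITS table are hypothetical, and only non-negative values are handled.

```java
public class DigitConversionSketch {

    // Hypothetical lookup table, standing in for the one used by the library code.
    private static final char[] DIGITS = {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9'};

    // Digit to char by adding the '0' constant: pure arithmetic, no memory access.
    static String viaAddition(int value) {
        char[] buf = new char[10];            // enough for any non-negative int
        int pos = buf.length;
        do {
            buf[--pos] = (char) ('0' + value % 10);
            value /= 10;
        } while (value != 0);
        return new String(buf, pos, buf.length - pos);
    }

    // Digit to char by table lookup: an extra array load, plus a bounds check
    // unless the compiler can prove the index is in range.
    static String viaTable(int value) {
        char[] buf = new char[10];
        int pos = buf.length;
        do {
            buf[--pos] = DIGITS[value % 10];
            value /= 10;
        } while (value != 0);
        return new String(buf, pos, buf.length - pos);
    }

    public static void main(String[] args) {
        System.out.println(viaAddition(100));
        System.out.println(viaTable(100));
    }
}
```

Both methods return the same string; the only difference is how each digit becomes a char.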

There are other minor differences in the implementation of PhaseStringOpts::int_getChars vs. Integer.getChars, but I guess they are not that significant for performance.


BTW, if you take a bigger number (e.g. 1234567890), the performance difference will be negligible because of an extra loop in Integer.getChars that converts two digits at once.
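
A rough sketch of what "two digits at once" means is given below. It is a simplified illustration, not the real Integer.getChars: the class, method and table names are hypothetical, the thresholds and arithmetic of the JDK method differ, and only non-negative values are handled.

```java
public class TwoDigitSketch {

    // Hypothetical tables: for every r in 0..99, the tens and ones characters.
    private static final char[] TENS = new char[100];
    private static final char[] ONES = new char[100];

    static {
        for (int r = 0; r < 100; r++) {
            TENS[r] = (char) ('0' + r / 10);
            ONES[r] = (char) ('0' + r % 10);
        }
    }

    static String twoDigitsAtOnce(int value) {
        char[] buf = new char[10];            // enough for any non-negative int
        int pos = buf.length;
        // Each iteration performs one division and emits two characters.
        while (value >= 100) {
            int q = value / 100;
            int r = value - q * 100;          // last two decimal digits
            value = q;
            buf[--pos] = ONES[r];
            buf[--pos] = TENS[r];
        }
        // One or two leading digits remain.
        if (value >= 10) {
            buf[--pos] = ONES[value];
            buf[--pos] = TENS[value];
        } else {
            buf[--pos] = ONES[value];
        }
        return new String(buf, pos, buf.length - pos);
    }

    public static void main(String[] args) {
        System.out.println(twoDigitsAtOnce(1234567890));
    }
}
```

For a ten-digit number this halves the number of divisions compared with a digit-by-digit loop, which is one plausible reason the gap narrows for large inputs.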

apangin