
This question has been asked many times on Stack Overflow, but none of those questions focused on performance.

The book Effective Java states:

If String s = new String("stringette"); occurs in a loop or in a frequently invoked method, millions of String instances can be created needlessly.

The improved version is simply the following: String s = "stringette"; This version uses a single String instance, rather than creating a new one each time it is executed.

So I tried both and found a significant improvement in performance:

for (int j = 0; j < 1000; j++) {
    String s = new String("hello World");
}

takes about 399 372 nanoseconds.

for (int j = 0; j < 1000; j++) {
    String s = "hello World";
}

takes about 23 000 nanoseconds.

Why is the performance improvement so large? Is some compiler optimization happening behind the scenes?

Bruno Parmentier
  • Erm, *for the exact reason you quoted from the book*. And that's ignoring that your test is unlikely to produce any meaningful results. See: http://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java – Brian Roach Feb 07 '13 at 18:12
  • You can't test something on 1,000 iterations like that - it will most likely produce results which don't reflect real life... – assylias Feb 07 '13 at 18:12
  • @BrianRoach yes that's true..the source of the confusion was because it's given everywhere that string literals are `object literals`..but i think object literals are **dynamic** not static.. –  Feb 07 '13 at 18:16
  • Why shouldn't they differ in performance, they do completely different things? – Tony Hopkinson Feb 07 '13 at 18:16
  • @cSharper - there's nothing static about Strings; it's just a pool of instances. Immutable != static. – Brian Roach Feb 07 '13 at 18:21

4 Answers


In the first case, a new object is created in each iteration; in the second case, it is always the same object, retrieved from the String constant pool.

In Java, when you do:

String bla = new String("xpto");

You force the creation of a new String object, which takes time and memory.

On the other hand, when you do:

String muchMuchFaster = "xpto"; //String literal!

The String is created only once (a new object) and cached in the String constant pool, so every time you refer to it in its literal form you get the exact same object. No allocation happens on those later uses, which is why it is so fast.

Now you may ask... what if two different points in the code retrieve the same literal and change it, aren't there problems bound to happen?!

No, because Strings, in Java, as you may very well know, are immutable! So any operation that would mutate a String returns a new String, leaving any other references to the same literal happy on their way.
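To make the immutability point concrete, here is a minimal sketch. The class name ImmutableLiteralDemo and the helper shout are hypothetical names chosen only for this illustration; the sketch assumes nothing beyond the standard String API.

```java
class ImmutableLiteralDemo {
    // Returns a new String with "!" appended; the argument itself is never modified.
    static String shout(String s) {
        return s.concat("!");
    }

    public static void main(String[] args) {
        String first = "xpto";  // both variables refer to the same pooled literal
        String second = "xpto";

        String changed = shout(first);

        System.out.println(changed); // xpto!
        System.out.println(first);   // xpto
        System.out.println(second);  // xpto
    }
}
```

The "mutating" call hands back a different object, so the other reference to the literal still sees the original text.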

This is one of the advantages of immutable data structures, but that's another issue altogether, and I could write a couple of pages on the subject.

Edit

Just a clarification: the constant pool isn't exclusive to String types. You can read more about it at the link below, or by searching for "Java constant pool".

http://docs.oracle.com/javase/specs/jvms/se7/jvms7.pdf

Also, a little test you can do to drive the point home:

String a = new String("xpto");
String b = new String("xpto");
String c = "xpto";
String d = "xpto";

System.out.println(a == b);
System.out.println(a == c);
System.out.println(c == d);

With all this, you can probably figure out the results of these Sysouts:

false
false
true

Since c and d are the same object, the == comparison holds true.
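As a follow-up to the test above, the sketch below contrasts == with equals(). The class name StringIdentityDemo and the helper relation are hypothetical, introduced only for this example. It suggests why contents should normally be compared with equals(), since == only tells you whether two references point at the same object.

```java
class StringIdentityDemo {
    // Classifies how two String references relate to each other.
    static String relation(String x, String y) {
        if (x == y) return "same object";
        if (x.equals(y)) return "equal contents, different objects";
        return "different contents";
    }

    public static void main(String[] args) {
        String a = new String("xpto");
        String c = "xpto";
        String d = "xpto";

        System.out.println(relation(a, c)); // equal contents, different objects
        System.out.println(relation(c, d)); // same object
    }
}
```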

pcalcao
  • Do you have a reference to support that? – Aaron Kurtzhals Feb 07 '13 at 18:13
  • @AaronKurtzhals Other than the JLS? [JLS 3.10.5](http://docs.oracle.com/javase/specs/jls/se7/html/jls-3.html#jls-3.10.5) – Brian Roach Feb 07 '13 at 18:14
  • If you want a more manageable source, this answer on SO gives a good explanation/example: http://stackoverflow.com/questions/10209952/java-constant-pool – pcalcao Feb 07 '13 at 18:15
  • I would be satisfied with a reference to the relevant sections in the Java Language Specification. :) – Aaron Kurtzhals Feb 07 '13 at 18:17
  • You shouldn't reference Java 5, especially since a lot has changed with the constant pool in Java 7. – Marko Topolnik Feb 07 '13 at 18:21
  • @MarkoTopolnik You're right, I corrected that by adding the link for the Java7 pdf version. – pcalcao Feb 07 '13 at 18:23
  • @MarkoTopolnik - I went back and forth, chose Java 5 because that particular thing (that all string literals are interned) hasn't changed. It's 50/50 - either someone like you says "don't quote Java5" or the guy asking says "Yeah, but that's Java 7!". Though I'm leaning more toward just always using 7 since EOL on 6 is upon us. – Brian Roach Feb 07 '13 at 18:24
  • @BrianRoach The constant pool used to be off-heap, with Java 7 it's on-heap. That has various consequences, even if possibly not relevant to this discussion (or even mentioned in the specification). I always stick to newer references because that's how things **are** as opposed to how they **used to be**. – Marko Topolnik Feb 07 '13 at 18:35
  • Note that the string won't be created even the first time: it has already been created at class loading time. It's like a class's "resource". – Marko Topolnik Feb 07 '13 at 18:48
  • i guess i should read the language specs of java to know more about this beautiful language..thxx everybody.. –  Feb 07 '13 at 18:58

The performance difference is in fact much greater: HotSpot has an easy time compiling the entire loop

for (int j = 0; j < 1000; j++) {
    String s = "hello World";
}

out of existence so the runtime is a solid 0. This, however, happens only after the JIT compiler kicks in; that's what warmup is for, a mandatory procedure when microbenchmarking anything on the JVM.

This is the code I ran:

public static void timeLiteral() {
  for (int j = 0; j < 1_000_000_000; j++) {
    String s = "hello World";
  }
}

public static void main(String... args) {
  // Repeat the measurement so the JIT warm-up is visible in the output.
  for (int i = 0; i < 10; i++) {
    final long start = System.nanoTime();
    timeLiteral();
    System.out.println((System.nanoTime() - start) / 1000); // elapsed microseconds
  }
}

And this is a typical output:

1412
38
25
1
1
0
0
1
0
1

You can observe the JIT taking effect very soon.

Note that I don't iterate one thousand, but one billion times in the inner method.
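For comparison, here is a rough sketch that times both variants side by side with the same repeated-rounds approach. The class name LiteralVsNewBenchmark, the method names, and the iteration count are assumptions made for this illustration, not code I ran for the numbers above. Each loop sums the string lengths so its result is actually used, but the JIT may still remove the allocation through escape analysis, so treat any output as indicative only. A harness such as JMH is the more reliable tool.

```java
class LiteralVsNewBenchmark {
    // Uses the pooled literal on every iteration; no allocation in the loop body.
    static long sumLiteral(int iterations) {
        long total = 0;
        for (int j = 0; j < iterations; j++) {
            String s = "hello World";
            total += s.length();
        }
        return total;
    }

    // Requests a fresh String object on every iteration.
    static long sumNew(int iterations) {
        long total = 0;
        for (int j = 0; j < iterations; j++) {
            String s = new String("hello World");
            total += s.length();
        }
        return total;
    }

    public static void main(String[] args) {
        final int iterations = 10_000_000; // arbitrary size chosen for this sketch
        // Several rounds, so the early (interpreted) rounds can be told apart from the warmed-up ones.
        for (int round = 0; round < 10; round++) {
            long start = System.nanoTime();
            long literalSum = sumLiteral(iterations);
            long literalMicros = (System.nanoTime() - start) / 1000;

            start = System.nanoTime();
            long newSum = sumNew(iterations);
            long newMicros = (System.nanoTime() - start) / 1000;

            System.out.println("literal: " + literalMicros + " us, new String: " + newMicros
                    + " us (checksums " + literalSum + ", " + newSum + ")");
        }
    }
}
```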

Marko Topolnik
  • the `JIT` is very quick there..wonder if the string literals are really necessary for performance improvement..thxx –  Feb 07 '13 at 18:55
  • @cSharper If you use `new String()`, the performance will be much lower because the constructor call and consequent heap allocation will not be optimized away. `String s="hello World"` is basically a no-op: assigning a constant to an unused local variable, so it's a trivial case for HotSpot. – Marko Topolnik Feb 07 '13 at 19:17
  • Never seen the underscore notation. Is this allowed in Java? – Kirill Rakhman Feb 08 '13 at 20:08
  • @cypressious It's allowed as of Java 7. – Marko Topolnik Feb 08 '13 at 20:10
  • @cypressious, for stuff like 1 billion you can use (int) 1e9 -> more readable imo. – bestsss Feb 11 '13 at 20:32

As has already been answered, the second version retrieves the instance from the String pool (remember that Strings are immutable).

Additionally, you should look at the intern() method, which lets you put a String into the pool when its value is not known until runtime, e.g.:

String s = stringVar.intern();

or

new String(stringVar).intern();
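A small sketch of what intern() does with a value that only exists at runtime. The class name InternDemo and the helper buildAtRuntime are hypothetical, used here only to produce a String that is not a compile-time literal.

```java
class InternDemo {
    // Builds the text at runtime so the compiler cannot fold it into a literal.
    static String buildAtRuntime(String left, String right) {
        return new StringBuilder(left).append(right).toString();
    }

    public static void main(String[] args) {
        String literal = "stringette";
        String built = buildAtRuntime("string", "ette");

        System.out.println(built == literal);          // false: built is a separate heap object
        System.out.println(built.intern() == literal); // true: intern() returns the pooled instance
    }
}
```

Whether interning pays off depends on how often the same value recurs; for strings that are used once, the pool lookup is extra work.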

One additional fact: a String object caches its hash code after the first hashCode() call. A pooled instance that is reused as a key in hash-based data structures such as HashMap therefore does not recompute its hash code on each lookup.

Michael

The JVM maintains a pool of references to unique String objects that are literals. In your new String example, each iteration wraps that pooled literal in a newly allocated String instance.

See http://www.precisejava.com/javaperf/j2se/StringAndStringBuffer.htm

sam