
I read Questions about Java's String pool and understand the basic concept of the string pool, but I still don't understand the behavior.

First: it works if you assign the values directly, and both s1 and s2 refer to the same object in the pool:

String s1 = "a" + "bc";
String s2 = "ab" + "c";
System.out.println("s1 == s2? " + (s1 == s2));

But if I then change s1 with s1 += "d", shouldn't the pool now contain a String object "abcd"? And when I then do s2 += "d", shouldn't it find "abcd" in the pool and assign that object to s2? It doesn't, and the two variables don't refer to the same object. Why is that?

String s1 = "abc";
String s2 = "abc";
System.out.println("s1 == s2? " + (s1 == s2));

s1 += "d";
s2 += "d";
System.out.println("s1 == s2? " + (s1 == s2));
Jaxox
  • possible duplicate of [Questions about Java's String pool](http://stackoverflow.com/questions/1881922/questions-about-javas-string-pool) – user207421 Jan 24 '13 at 01:09
  • @EJP Asker mentions that very topic and says it didn't answer his questions. – glomad Jan 24 '13 at 01:54

6 Answers


Strings are guaranteed to be pooled when you call String.intern() on a string.

String s1 = "abcd".intern();
String s2 = "abc";
s2 += "d";
s2 = s2.intern();
s1 == s2 // returns true

When the compiler sees a constant expression, it's smart enough to evaluate it at compile time and pool the resulting string literal, i.e.:

String s1 = "abcd";
String s2 = "abcd";
s1 == s2 // returns true

The Java Language Specification states:

Each string literal is a reference (§4.3) to an instance (§4.3.1, §12.5) of class String (§4.3.3). String objects have a constant value. String literals-or, more generally, strings that are the values of constant expressions (§15.28)-are "interned" so as to share unique instances, using the method String.intern.

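So in the case of s2 += "d", the compiler wasn't as clever as you are and only pooled the literal "d". The concatenation itself is not a constant expression, so its result is a new String object created at run time.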
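To make the difference concrete, here is a minimal sketch (the class and method names are my own, purely for illustration). It assumes only the behavior quoted above: literals are interned, a run-time += produces a new object, and intern() hands back the pooled instance.

```java
final class InternDemo {
    // Builds "abcd" at run time. The += is not a constant expression,
    // so the result is a newly created String, not the pooled literal.
    static String buildAtRuntime() {
        String s = "abc";
        s += "d";
        return s;
    }

    public static void main(String[] args) {
        String literal = "abcd";         // interned because it is a literal
        String built = buildAtRuntime(); // same characters, different object

        System.out.println(literal == built);          // false
        System.out.println(literal == built.intern()); // true
    }
}
```

The second comparison only succeeds because intern() looks the value up in the pool and returns the instance that the literal already refers to.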

Mirko Adari
  • While this answer states a true fact, it doesn't explain the *why* basis of the question. – FThompson Jan 23 '13 at 21:57
  • Interesting, so is the whole string pool thing missing the point if I need to add .intern() to every single string? – Jaxox Jan 23 '13 at 22:02
  • @user2005443 You generally shouldn't add `intern()` to every string. That's premature optimization. If there are certain strings which you know (or find out via profiling) are going to be used a lot it could help improve performance, but if you use it indiscriminately it can [be quite dangerous](http://www.codeinstructions.com/2009/01/busting-javalangstringintern-myths.html). – Jeff Jan 23 '13 at 22:14

I'm not sure about this, so this is pretty much speculation, but I suspect that there may be some compiler trickery going on in the first example (where it's inline and pretty obvious what's going on), but it's not clever enough to pull it off in the second example (where it's not so obvious).

If I'm right, either the compiler sees "a" + "bc" and simply folds that down at compile time to "abc", or it sees the two lines and pools the strings because it realizes they will be used. I'm betting on the former.

Not all strings necessarily get pooled.
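If the first guess is right, it should be observable directly. The following is a rough sketch (class and method names are made up for the example) that compares a concatenation of literals with a concatenation involving an ordinary variable:

```java
final class ConstantFoldingDemo {
    // Both right-hand sides consist only of literals, so the compiler
    // can fold each one to the single constant "abc".
    static boolean foldedLiteralsShareInstance() {
        String s1 = "a" + "bc";
        String s2 = "ab" + "c";
        return s1 == s2 && s1 == "abc";
    }

    // prefix is an ordinary (non-final) variable, so prefix + "c" is
    // evaluated at run time and yields a new String object.
    static boolean runtimeConcatSharesInstance() {
        String prefix = "ab";
        String s3 = prefix + "c";
        return s3 == "abc";
    }

    public static void main(String[] args) {
        System.out.println(foldedLiteralsShareInstance());  // true
        System.out.println(runtimeConcatSharesInstance());  // false
    }
}
```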

Jeff
  • That's correct, the compiler does constant folding on the compile-time constants and puts just one string, "abc", into the constant pool for the class. I believe all strings that come from the constant pool are interned. Other strings, created at run time, are not interned unless you explicitly call String#intern on them. – David Conrad Jan 23 '13 at 22:00
  • That's indeed what happens. If you check the bytecode, you'll see that "a" + "bc" is translated to a single ldc instruction (pushing a constant), while the other instruction s1 + "d" translates to a call to StringBuilder. – Gothmog Jan 23 '13 at 22:02

See the documentation for String#intern(). The last line there states:

All literal strings and string-valued constant expressions are interned.

Your += example is neither a literal string nor a string-valued constant expression, so it is not put in the String pool.

GriffeyDog

The compiler can perform constant evaluation, but not in the case where you modify the values.

Try the following instead, and see what happens if you drop final from either variable.

final String s1 = "abc";
final String s2 = "abc";
System.out.println("s1 == s2? " + (s1 == s2));

String s3 = s1 + "d";
String s4 = s2 + "d";
System.out.println("s3 == s4? " + (s3 == s4));
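For comparison, here is the same experiment with both variants side by side, as a small sketch (the class and method names are just placeholders). With final, s1 and s2 are constant variables, so s1 + "d" is itself a constant expression and gets folded and interned. Without final, the concatenation happens at run time.

```java
final class FinalConstantDemo {
    // final + literal initializer: s1 + "d" is a compile-time constant.
    static boolean withFinal() {
        final String s1 = "abc";
        final String s2 = "abc";
        String s3 = s1 + "d";
        String s4 = s2 + "d";
        return s3 == s4;
    }

    // No final: each concatenation creates its own String at run time.
    static boolean withoutFinal() {
        String s1 = "abc";
        String s2 = "abc";
        String s3 = s1 + "d";
        String s4 = s2 + "d";
        return s3 == s4;
    }

    public static void main(String[] args) {
        System.out.println("with final:    " + withFinal());    // true
        System.out.println("without final: " + withoutFinal()); // false
    }
}
```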
Peter Lawrey

This is my guess:

String s1 = "a" + "bc"; String s2 = "ab" + "c";

I think that at compile time these are determined to produce the same string, so only one object is made for both.

But when you add "d" to both of them, this is done separately for each string (since it happens at run time, there could be things like exceptions interrupting it, so it can't be done in advance), and so it doesn't automatically make them reference one object.

Patashu

I think what happens here is:

  1. For String s1 = "a" + "bc"; String s2 = "ab" + "c"; the Java compiler is smart enough to know that the literal values of s1 and s2 are the same, so it points them to the same literal in the string pool.

  2. For s1 += "d"; s2 += "d"; there is no way for the compiler to know whether s1 and s2 will end up with the same value. At run time, unless you call String.intern(), the JVM won't check the string literal pool to see if the value is already there.
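As a rough illustration of the second point (names here are invented for the example), two strings built at run time hold the same characters but are separate objects. If the goal is to compare contents, equals() is the usual tool, and intern() is only needed when identity really matters:

```java
final class CompareDemo {
    // Run-time concatenation: the parameter is not a compile-time constant.
    static String appendD(String s) {
        return s + "d";
    }

    public static void main(String[] args) {
        String s1 = appendD("abc");
        String s2 = appendD("abc");

        System.out.println(s1 == s2);                   // false: two objects
        System.out.println(s1.equals(s2));              // true: same contents
        System.out.println(s1.intern() == s2.intern()); // true: one pooled instance
    }
}
```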

seiya