15

Consider the following code:

String s = new String("1");
s.intern();
String s2 = "1";
System.out.println(s == s2);

String s3 = new String("1")+new String("1");
s3.intern();
String s4 = "11";
System.out.println(s3 == s4);

The output of the code above is:

false
true

I know that s and s2 are different objects, so the first comparison evaluates to false, but the second comparison evaluates to true. Can anyone explain the difference?

Robby Cornelissen
bin he
  • 1
    You may wish to refer to [this](http://javatechniques.com/blog/string-equality-and-interning/) – Pratik Ambani Mar 16 '17 at 03:38
  • 3
    Note that calling `s3.intern()` after initializing `s4` changes the output to `false`. This *seems* to indicate that the literal `"11"` is only retrieved from the pool as the line is executed, which isn't how I understood string literal interning to work. – shmosel Mar 16 '17 at 04:31
  • Interestingly, changing to `s = new String("11")` and `s2 = "11"` all of a sudden makes the comparison `s3 == s4` evaluate to false. – Robby Cornelissen Mar 16 '17 at 04:57
  • 1
    @RobbyCornelissen That's essentially the same observation I made. You just moved the literal higher up. – shmosel Mar 16 '17 at 04:59
  • @PratikAmbani That looks like the correct answer to me. You should post it as an answer. – Maybe_Factor Mar 16 '17 at 05:15
  • 2
    @Maybe_Factor It's not an answer at all, it's just a link to an article. – shmosel Mar 16 '17 at 05:26
  • 6
    For users that do not have enough reputation to see the graveyard of deleted answers to this question – if you're planning to write an answer that a) outlines the fundamentals of string interning in Java, or b) suggests that the result of `String.intern()` should be assigned back to the variable: don't bother. The real question is why the behavior in the two cases (`s == s2` vs `s3 == s4`) is different. – Robby Cornelissen Mar 16 '17 at 05:34
  • What does the compiled bytecode look like? – Rogue Mar 16 '17 at 06:42
  • I wonder how this can be unclear at all, even to gold badge owners. It is so obvious that the literal `"1"` has to be loaded when using `new String("1")` and that the String `"11"` will be added to the pool by `s3.intern();` and then fetched for `String s4 = "11";`. This is basic stuff like 1+1=2. – Tom Mar 16 '17 at 08:48

4 Answers

23

Here's what's happening:


Example 1

String s1 = new String("1"); 
s1.intern();
String s2 = "1";
  1. The string literal "1" (passed into the String constructor) is interned at address A.
    String s1 is created at address B because it is not a literal or constant expression.
  2. The call to intern() has no effect. String "1" is already interned, and the result of the operation is not assigned back to s1.
  3. String s2 with value "1" is retrieved from the string pool, so points to address A.

Result: Strings s1 and s2 point to different addresses.
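The three steps can be checked directly with identity comparisons. This is only a minimal sketch: the class and method names (`Example1Check`, `literalVsConstructed`) are made up for illustration, and it additionally captures the return value of intern() to show that the pooled instance is the one the literal resolves to.

```java
class Example1Check {
    // Returns { s1 == s2, s1.intern() == s2 } for the snippet above.
    static boolean[] literalVsConstructed() {
        String s1 = new String("1"); // literal "1" is pooled; s1 is a separate object
        String pooled = s1.intern(); // returns the pooled "1"; s1 itself is unchanged
        String s2 = "1";             // resolves to the pooled "1"
        return new boolean[] { s1 == s2, pooled == s2 };
    }

    public static void main(String[] args) {
        boolean[] r = literalVsConstructed();
        System.out.println(r[0]); // false
        System.out.println(r[1]); // true
    }
}
```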


Example 2

String s3 = new String("1") + new String("1");
s3.intern();
String s4 = "11";
  1. String s3 is created at address C.
  2. The call to intern() adds the string with value "11" at address C to the string pool.
  3. String s4 with value "11" is retrieved from the string pool, so points to address C.

Result: Strings s3 and s4 point to the same address.
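Step 2 only works because nothing equal to "11" is in the pool when intern() is called. The sketch below isolates that "first one interned wins" behavior without using a literal at all. It assumes a HotSpot JVM on Java 7 or later, and it assumes a value built from `System.nanoTime()` has not been interned by anything else; the names `Example2Check` and `firstInternWins` are hypothetical.

```java
class Example2Check {
    // Returns { first.intern() == first, second.intern() == second }.
    static boolean[] firstInternWins() {
        // Assumption: a value derived from nanoTime() is not in the pool yet.
        String half = Long.toString(System.nanoTime());
        String first = half + half;  // run-time concatenation: a new, unpooled object
        String second = half + half; // equal contents, but a different object

        boolean firstPooled = first.intern() == first;    // first itself goes into the pool
        boolean secondPooled = second.intern() == second; // pool already holds first
        return new boolean[] { firstPooled, secondPooled };
    }

    public static void main(String[] args) {
        boolean[] r = firstInternWins();
        System.out.println(r[0]); // true
        System.out.println(r[1]); // false
    }
}
```

This is the same effect noted in the comments: if some other code has already put "11" into the pool, s3.intern() leaves s3 outside the pool and s3 == s4 becomes false.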


Summary

String "1" is interned before the call to intern() is made, by virtue of its presence in the s1 = new String("1") constructor call.

Changing that constructor call to s1 = new String(new char[]{'1'}) will make the comparison of s1 == s2 evaluate to true because both will now refer to the string that was explicitly interned by calling s1.intern().
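A sketch of the char[] variant, for anyone who wants to try it. The class name `CharArrayCheck` is illustrative. The first value depends on whether anything else in the running JVM has already interned "1" and on literals being resolved lazily, so no expected output is given for it; only the second comparison is guaranteed.

```java
class CharArrayCheck {
    // Returns { s1 == s2, s1.intern() == s2 }.
    static boolean[] charArrayThenIntern() {
        String s1 = new String(new char[] {'1'}); // no "1" literal is loaded here
        String interned = s1.intern();            // pools s1 itself only if no "1" is pooled yet
        String s2 = "1";                          // resolves to whatever the pool holds
        return new boolean[] { s1 == s2, interned == s2 };
    }

    public static void main(String[] args) {
        boolean[] r = charArrayThenIntern();
        System.out.println(r[0]); // depends on the prior state of the pool
        System.out.println(r[1]); // true
    }
}
```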

(I used the code from this answer to get information about the strings' memory locations.)

Robby Cornelissen
  • Ah, that makes sense now, but it requires very careful reading. Note that the observed behavior changes if any other code has already put `"11"` into the interned string pool before. – Roland Illig Mar 16 '17 at 07:13
  • @RolandIllig Indeed, I noted that in the comments to the question. – Robby Cornelissen Mar 16 '17 at 07:20
  • The steps in your breakdown can be deduced from the output, as I [commented](http://stackoverflow.com/questions/42824821/java-please-help-me-to-understand-these-code-result#comment72761604_42824821) earlier. I'm much more interested in seeing where it's documented, if indeed it is. – shmosel Mar 16 '17 at 07:25
  • @shmosel What would you like to see documented? That the mere use of a string literal (without assigning it) is enough to have it interned in the string pool? – Robby Cornelissen Mar 16 '17 at 07:26
  • That string literals aren't interned until they're used. – shmosel Mar 16 '17 at 07:28
  • @shmosel Define "used". – Robby Cornelissen Mar 16 '17 at 07:29
  • Referenced, assigned, `s4 = "11"` – shmosel Mar 16 '17 at 07:32
  • @shmosel Well, that's exactly my point. They *are* interned without being referenced or assigned. If, for example, you call `System.out.println("a")`, the string `"a"` will be interned. – Robby Cornelissen Mar 16 '17 at 07:36
  • 2
    I meant "used" the same way you did. My point is I would have expected `"a"` to be interned long before that statement executes, possibly at class load. I might have been entirely wrong, but it would be nice to see a source one way or another. – shmosel Mar 16 '17 at 07:40
  • If I understand you correctly, the string that is kept in the pool is the first one of this value that is interned. – Maurice Perry Mar 16 '17 at 07:47
  • @MauricePerry that is indeed what I observed from inspecting the memory addresses. – Robby Cornelissen Mar 16 '17 at 07:49
  • Good answer. Maybe you should stress somewhere that this is (also) why it is a bad idea to use the `String` constructor that takes a string literal (like Effective Java warns). – Kedar Mhaswade Mar 16 '17 at 12:23
  • @shmosel The [JVM Specification, Chapter 5.1 The Run-Time Constant Pool](https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-5.html#jvms-5.1) looks like a good source. There is no obvious single sentence that explains it all, so it's probably a combination of five of the sentences in that chapter that will form the proof. – Roland Illig Mar 16 '17 at 21:29
12

For scenario 1:

String s = new String("1");
s.intern();
String s2 = "1";
System.out.println(s == s2);

with bytecode:

   0: new           #2                  // class java/lang/String
   3: dup
   4: ldc           #3                  // String 1
   6: invokespecial #4                  // Method java/lang/String."<init>":(Ljava/lang/String;)V
   9: astore_1
  10: aload_1
  11: invokevirtual #5                  // Method java/lang/String.intern:()Ljava/lang/String;
  14: pop
  15: ldc           #3                  // String 1

String s = new String("1"); creates a new String object at a new address, while the literal "1" passed to the constructor is already in the string pool and is loaded by:

ldc #3 // String 1

For s2, the bytecode is:

15: ldc #3 // String 1

s2 points to the "1" held in the string pool, so s and s2 have different addresses and the result is false.

For scenario 2:

String s3 = new String("1")+new String("1");
s3.intern();
String s4 = "11";
System.out.println(s3 == s4);

with bytecode:

   0: new           #2                  // class java/lang/StringBuilder
   3: dup
   4: invokespecial #3                  // Method java/lang/StringBuilder."<init>":()V
   7: astore_1
   8: aload_1
   9: ldc           #4                  // String 1
  11: invokevirtual #5                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
  14: ldc           #4                  // String 1
  16: invokevirtual #5                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
  19: invokevirtual #6                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
  22: astore_2
  23: aload_2
  24: invokevirtual #7                  // Method java/lang/String.intern:()Ljava/lang/String;
  27: astore_3
  28: ldc           #8                  // String 11

As the bytecode shows, new String("1")+new String("1") is built using a StringBuilder:

new #2 // class java/lang/StringBuilder

The result is an entirely new object that is not in the string pool.

After s3.intern() is called, the current s3 object is added to the string pool.

s4 is then loaded with:

ldc #8 // String 11

so s3 and s4 have the same address and the result is true.
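The StringBuilder sequence in the bytecode can be written out by hand, which makes it easier to see that the result is always a fresh object. This is a rough equivalent only (`ConcatDesugared` and `concat` are illustrative names); also note that javac on Java 9 and later normally compiles `+` to an invokedynamic call rather than StringBuilder, though the result is still a new, unpooled String.

```java
class ConcatDesugared {
    // Hand-written version of what the bytecode above does for a + b.
    static String concat(String a, String b) {
        return new StringBuilder().append(a).append(b).toString();
    }

    public static void main(String[] args) {
        String x = concat("1", "1");
        String y = concat("1", "1");
        System.out.println(x.equals("11")); // true: same contents as the literal
        System.out.println(x == y);         // false: toString() allocates a new String each time
    }
}
```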

chengpohi
0

For anyone using Groovy, one additional note: the behavior is different there.

[Screenshot not recoverable]

Yu Jiaao
-3

s.intern() doesn't change the string s. You should have written:

    s = s.intern();
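For completeness, a sketch of the question's code with the result of intern() assigned back in both cases. The class name `AssignBackCheck` is made up. With the assignment, both comparisons no longer depend on what was in the pool beforehand, because each variable ends up referring to the pooled instance.

```java
class AssignBackCheck {
    // Returns { s == s2, s3 == s4 } with intern() results assigned back.
    static boolean[] run() {
        String s = new String("1");
        s = s.intern();   // s now refers to the pooled "1"
        String s2 = "1";

        String s3 = new String("1") + new String("1");
        s3 = s3.intern(); // s3 now refers to the pooled "11"
        String s4 = "11";

        return new boolean[] { s == s2, s3 == s4 };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println(r[0]); // true
        System.out.println(r[1]); // true
    }
}
```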
Maurice Perry
  • So why does it seem to work for `s3.intern()`?. Please read my comment on the original question. – Robby Cornelissen Mar 16 '17 at 07:10
  • @ShyamBaitmangalkar yes, it does. @Robby I guess the compiler evaluates the expression. What I've noticed, is that with `s = s.intern()` and `s3 = s3.intern()`, I obtain true and true. – Maurice Perry Mar 16 '17 at 07:14
  • 1
    But you don't need to reassign `s3` to get `true`, as in OP's example. – shmosel Mar 16 '17 at 07:19
  • @shmosel no, but the documentation says "Returns a canonical representation for the string object.", so the correct way to use .intern(), is to use the returned value (and when done this way, it behaves as expected). That said, I don't know why the original program behaves the way it does. – Maurice Perry Mar 16 '17 at 07:25
  • 2
    We know it's not the correct use. But that's not the question. – shmosel Mar 16 '17 at 07:26