
I've read this answer about how to check if a string is interned in Java, but I don't understand the following results:

String x = args[0]; // args[0] = "abc";
String a = "a";
String y = a + "bc";
System.out.println(y.intern() == y); // true

But if I declare a string literal:

String x = "abc";
String a = "a";
String y = a + "bc";
System.out.println(y.intern() == y); // false

Besides, without any string literal, `args[0]` itself seems to be interned directly:

// String x = "abc";
String y = args[0];
System.out.println(y.intern() == y); // true (???)
// false if the first line is uncommented

Why does `y.intern() == y` change depending on whether x is a literal or not, even in the example where the command-line argument is used?

I know literal strings are interned at compile time, but I don't get why that affects the previous examples. I have also read several questions about string interning, like String Pool behavior, Questions about Java's String pool and Java String pool - When does the pool change?. However, none of them gives a possible explanation for this behaviour.

Edit:

I wrongly wrote that in the third example the result doesn't change if `String x = "abc";` is declared, but it does.

A. Rodas
  • I get false in all cases – aditsu quit because SE is EVIL Feb 23 '13 at 18:26
  • You could get "true" in the second case if the compiler were to "cheat" and combine the 2nd and 3rd assignments (which is at best marginally "legal" given Java's rules). – Hot Licks Feb 23 '13 at 18:43
  • This behavior is kind of scary... – Martijn Courteaux Feb 23 '13 at 18:44
  • @HotLicks - only if the variable is declared final. – Perception Feb 23 '13 at 18:44
  • The explanation is simple: if you get true, then the string was already interned, otherwise you get false. That is all one can say about it. The answer to "Why is xyz not interned()?" is "Because nobody did it." and "Why is abc interned?" is "Because someone already did it." – Ingo Feb 23 '13 at 18:45
  • @MartijnCourteaux - Don't know why you say "scary". A given string value may or may not already have been interned -- could have been in a different method an hour ago. Also, `intern()` is not guaranteed to give back the original string even if it's the first such string -- it may give back a copy. – Hot Licks Feb 23 '13 at 18:46
  • @Perception - Outside of Java's strict rules for statement order, there would be no need for "final", and I don't see why "final" would make a difference with Java. It's a simple code optimization "trick" -- copy propagation. `"a" + "bc"` would in normal circumstances be converted to a single (interned) string by javac, and copy propagation replaces `a` with `"a"` during optimization. – Hot Licks Feb 23 '13 at 18:49
  • @HotLicks - I may be a little sandy with my specs, but I do believe section 3.10.5 of the JLS precludes such compiler optimizations for Strings. – Perception Feb 23 '13 at 19:05
  • @Perception - Combining adjacent literals is definitely allowed (and maybe even required). Dunno about the copy propagation, though. – Hot Licks Feb 23 '13 at 19:09
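
On the `final` question debated in the comments above, here is a minimal sketch that can be run to see what javac does. It relies on the JLS rules for string literals and constant expressions as I understand them: a `final` local initialized with a literal is a constant variable, so `a + "bc"` is folded at compile time, while the non-final version is concatenated at run time. The class and method names are made up for illustration.

```java
class ConstantFolding {
    // 'a' is a constant variable, so a + "bc" is a constant expression
    // and is compiled as the literal "abc".
    static boolean foldedWithFinal() {
        final String a = "a";
        String y = a + "bc";
        return y == "abc";
    }

    // 'a' is an ordinary local, so the concatenation happens at run time
    // and produces a new String object.
    static boolean foldedWithoutFinal() {
        String a = "a";
        String y = a + "bc";
        return y == "abc";
    }

    public static void main(String[] args) {
        System.out.println(foldedWithFinal());    // true
        System.out.println(foldedWithoutFinal()); // false
    }
}
```

This matches the `false` reported in the question's second example, where `a` is not declared `final`.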

2 Answers


This happens because `y.intern()` returns `y` itself if no equal string was in the pool before the call. If an equal string was already pooled, the call returns that existing instance, which is most likely a different object from `y`.

However, all of this is highly implementation-dependent, so it may differ between versions of the JVM and the compiler.
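
A minimal sketch of the cases described above. The names `InternCheck`, `build` and `isCanonical` are invented for this example. The first two results follow from the language rules for literals; the third is what recent HotSpot JVMs typically do, and the specification does not promise it, so no expected value is shown for it.

```java
class InternCheck {
    // Builds a String at run time, so the result is never a compile-time constant.
    static String build(String prefix, String suffix) {
        return new StringBuilder(prefix).append(suffix).toString();
    }

    // True when intern() hands back this very instance.
    static boolean isCanonical(String s) {
        return s.intern() == s;
    }

    public static void main(String[] args) {
        String literal = "abc";          // literals are always in the pool
        String copy = build("a", "bc");  // equal to the literal, but a new object
        System.out.println(isCanonical(literal)); // true
        System.out.println(isCanonical(copy));    // false

        // A value that is very unlikely to be pooled already.
        String fresh = build("intern-demo-", Long.toString(System.nanoTime()));
        // Implementation-dependent: depends on whether the JVM pools the
        // instance itself or a copy of it.
        System.out.println(isCanonical(fresh));
    }
}
```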

Henry
  • Then does it mean that `y.intern() == y` is false when the string is already interned? Doesn't it contradict with [the answer to which I referred at the beginning](http://stackoverflow.com/questions/4883821/java-tell-if-a-string-is-interned/4883828#4883828)? – A. Rodas Feb 23 '13 at 19:04
  • If a string that equals y but is a different instance is already interned you will get false. If the same instance as `y` was previously interned you will get true. You will also get true if no string that equals `y` was interned before. – Henry Feb 23 '13 at 19:12
  • It does contradict the answer, but the code in the answer is wrong. It should be `!=` to check if it was already interned. – Jochen Feb 23 '13 at 19:12
  • @Jochen That depends if the answer means the `myString` instance or just a string that equals `myString` – Henry Feb 23 '13 at 19:14
  • @Henry True. I guess a seemingly simple question like this is impossible to answer with just one line of code without a longish explanation :) – Jochen Feb 23 '13 at 19:36
  • @Henry I've made some test and now I understand it better. Thank you! – A. Rodas Feb 24 '13 at 13:05

Implementation details might differ, but this is exactly the behavior I would expect. Your first case means that command-line arguments are not interned by default. Hence `y.intern()` returns the reference to `y` after interning it.

The second case is where the VM automatically interns the literal, so that y.intern() returns the reference to x, which is different from y.

And the last case again happens because nothing is interned by default, so the call to `intern()` returns the reference to `y`. I believe it is legal to intern strings more aggressively, but this is the minimal behavior required by the spec as I understand it.
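
For comparison, a small sketch of the part that does not depend on who interned first: two strings with equal contents always intern to the same reference, and that reference is the literal's if one exists. `InternGuarantee`, `fresh` and `sameCanonical` are hypothetical names used only here.

```java
class InternGuarantee {
    // Creates a new String object on every call.
    static String fresh(String value) {
        return new String(value.toCharArray());
    }

    // Equal contents give the same canonical reference; different contents do not.
    static boolean sameCanonical(String s, String t) {
        return s.intern() == t.intern();
    }

    public static void main(String[] args) {
        String first = fresh("abc");
        String second = fresh("abc");
        System.out.println(first == second);              // false
        System.out.println(sameCanonical(first, second)); // true
        System.out.println(sameCanonical(first, "abd"));  // false
        System.out.println(first.intern() == "abc");      // true
    }
}
```

Comparing two interned references like this is portable, whereas `y.intern() == y` only tells you whether `y` happens to be the pooled instance on the JVM you are running.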

Jochen