12

I have tried the code below:

public class TestIntern {
  public static void main(String[] args) {
   char[] c1={'a','b','h','i'};
   String s1 = new String(c1);
   s1.intern();
   String s2="abhi";
   System.out.println(s1==s2);//true

   char[] c2={'j','a','v','a'};
   String sj1 = new String(c2);
   sj1.intern();
   String sj2="java";
   System.out.println(sj1==sj2);//false

   char[] c3={'J','A','V','A'};
   String tj1 = new String(c3);
   tj1.intern();
   String tj2="JAVA";
   System.out.println(tj1==tj2);//true
  }
}

I have tried many different literals.

Could anyone please explain why intern() doesn't work as expected with literal "java"? Why do the above reference comparisons evaluate to true, except when the literal is "java"?

Mureinik
  • tried sj.intern() == sc.intern() it is supposed to return true – Marcos Vasconcelos Mar 27 '18 at 21:11
  • You don't compare strings with == – nicomp Mar 27 '18 at 21:11
  • You are ignoring the return value of `intern()`. Re-read the docs, assign the return value to your original reference, and you'll see it works as you expect. Also, please do not ever depend on `intern()`. – Petr Janeček Mar 27 '18 at 21:11
  • [`String.intern`](https://docs.oracle.com/javase/10/docs/api/java/lang/String.html#intern()) is not a `void` method, it returns a `String`. You are ignoring the return value. Read the documentation. Do not ignore return values. – Boris the Spider Mar 27 '18 at 21:11
  • Looks like a typo to me; you just didn't reassign `s1` or `sj` or `sj1` anywhere. – Makoto Mar 27 '18 at 21:12
  • @nicomp The OP wants to compare references in this case. – GriffeyDog Mar 27 '18 at 21:12
  • @MarcosVasconcelos yes, it is supposed to. But it is not returning true. I am also surprised. – Abhishek Kumar Mar 27 '18 at 21:13
  • The answers and comments until now are, as far as I can tell, missing the point of the question. At the **very** least, they don't nearly explain the observed behavior, which is as described, and **very** surprising (for me, at least) – Marco13 Mar 27 '18 at 21:13
  • @MDSayemAhmed That's because "ABCD" is always the reference from the first invocation, while subsequent `new String()` calls always return a new object. – Petr Janeček Mar 27 '18 at 21:30
  • Well, people are voting to close this question, as being *"...caused by a problem that can no longer be reproduced or a simple typographical error"*. These people may simply be not nerdy enough for this sort of question :-) – Marco13 Mar 27 '18 at 21:46
  • @AbhishekKumar Petr Janeček essentially gave the answer, and I just "confirmed" it, as far as reasonably possible. (I'm just mentioning this for the case that you accidentally accepted my answer, but it's up to you, of course...) – Marco13 Mar 27 '18 at 21:48
  • I think the people voting to close just missed the fact that the ignored return value is actually part of the question and not just a mistake. – Radiodef Mar 27 '18 at 22:23
  • I know this is definitely a dupe, but it's pretty hard to search for it... – shmosel Mar 27 '18 at 22:34

4 Answers

16

When you create `new String(new char[] {'a', 'b', 'h', 'i'})` and call `intern()` on it, the pool does not yet contain an equal string, so the object you just created becomes the canonical one and is stored in the string constant pool. The literal `"abhi"` is then resolved from that pool, so it refers to your own instance.

Your problem is that the string `"java"` already exists in the string constant pool before your program starts; the JVM simply has it there for its own use. Therefore, calling `intern()` on `new String(new char[] {'j', 'a', 'v', 'a'})` does not intern your object. Instead, it returns the pre-existing canonical instance from the pool, and you ignore that return value.

You should use the return value rather than ignore it. You never know whether your "definitely original" string has already been living in the constant pool since the start of the JVM. In any case, all of this is implementation-dependent: either always use the references returned by the `intern()` method, or never use them. Do not mix the two.
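A minimal sketch of the advice above, assuming nothing beyond the standard library. The class name `InternReturnValueDemo` and the helper `fresh` are hypothetical names made up for this illustration. It keeps the reference returned by `intern()` and compares that reference with the literal, which should hold regardless of what the JVM had pooled beforehand:

```java
// Illustrative sketch only: the class and method names are hypothetical.
final class InternReturnValueDemo {

    // Builds a new String object at run time from the characters of the argument.
    static String fresh(String text) {
        return new String(text.toCharArray());
    }

    // True when the reference returned by intern() is the same object as the literal.
    static boolean internedMatchesLiteral() {
        String canonical = fresh("java").intern(); // keep the return value
        return canonical == "java";                // reference comparison is intended here
    }

    public static void main(String[] args) {
        System.out.println(internedMatchesLiteral()); // true
    }
}
```

Because both the literal and the result of `intern()` refer to the pooled instance, the comparison does not depend on whether your own object or an earlier one ended up as the canonical string.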

Petr Janeček
  • Then how come this program prints `true` on the first line, but `false` for the rest: https://ideone.com/SQiiRf? Not only that, for all subsequent calls to `callMethod` it will print false. – MD Sayem Ahmed Mar 27 '18 at 21:26
  • @MDSayemAhmed because `new String()` always creates a new reference, it never reaches into the constant pool. The first invocation is the one that gets stored in the constant pool; the other invocations create different objects that are ignored by the `intern()` method. On the other hand, a `"java"` literal will always return the same object, as far as I know. Uh, is that in the spec, though? I'm honestly not sure. – Petr Janeček Mar 27 '18 at 21:29
  • Yeah, "java" will always return the same interned object, that I can confirm. Thank you for explaining with patience. – MD Sayem Ahmed Mar 27 '18 at 21:35
  • @PetrJaneček Your line "the literal "java" exists in the constant string pool before the start of your program - the JVM simply has it there for some use" really makes sense. I tried re-assigning like `sj = sj.intern()` for 'java' and it worked perfectly fine. Thank you. – Abhishek Kumar Mar 27 '18 at 21:37
3

The answer by Petr Janeček is almost certainly correct (+1 there).

Really proving it is hard, because much of the string pool resides in the JVM itself, and one could hardly access it without a tweaked VM.

But here is some more evidence:

public class TestInternEx
{
    public static void main(String[] args)
    {
        char[] c1 = { 'a', 'b', 'h', 'i' };
        String s1 = new String(c1);
        String s1i = s1.intern();
        String s1s = "abhi";
        System.out.println(System.identityHashCode(s1));
        System.out.println(System.identityHashCode(s1i));
        System.out.println(System.identityHashCode(s1s));
        System.out.println(s1 == s1s);// true

        char[] cj = { 'j', 'a', 'v', 'a' };
        String sj = new String(cj);
        String sji = sj.intern();
        String sjs = "java";
        System.out.println(System.identityHashCode(sj));
        System.out.println(System.identityHashCode(sji));
        System.out.println(System.identityHashCode(sjs));
        System.out.println(sj == sjs);// false

        char[] Cj = { 'J', 'A', 'V', 'A' };
        String Sj = new String(Cj);
        String Sji = Sj.intern();
        String Sjs = "JAVA";
        System.out.println(System.identityHashCode(Sj));
        System.out.println(System.identityHashCode(Sji));
        System.out.println(System.identityHashCode(Sjs));
        System.out.println(Sj == Sjs);// true

        char[] ct = { 't', 'r', 'u', 'e' };
        String st = new String(ct);
        String sti = st.intern();
        String sts = "true";
        System.out.println(System.identityHashCode(st));
        System.out.println(System.identityHashCode(sti));
        System.out.println(System.identityHashCode(sts));
        System.out.println(st == sts);// false


    }
}

The program prints, for each string, the identity hash code of

  • the string that is created with new String
  • the string that is returned by String#intern
  • the string that is given as a literal

The output is along the lines of this:

366712642
366712642
366712642
true
1829164700
2018699554
2018699554
false
1311053135
1311053135
1311053135
true
118352462
1550089733
1550089733
false

One can see that for the string "java", the identity hash code of the `new String` differs from that of the string literal, while the literal's hash code equals that of the result of `String#intern`. This strongly suggests that `String#intern` returned the very same object as the literal, and not the string it was called on.

I also added the string "true" as another test case. It shows the same behavior, presumably because the string "true" has already been interned while bootstrapping the VM.

Marco13
  • Do you know why OpenJDK on Linux prints `true`, `true`, `true`, `false`? I thought that a common word like "java" will be interned. Is there a way to inspect what's inside the String Pool? – Karol Dowbecki Mar 27 '18 at 21:54
  • I have tried with String "false" and it is returning true as expected. @Marco13 Do we have any mechanism to know various strings which have already appeared before when VM is bootstrapped? – Abhishek Kumar Mar 27 '18 at 22:00
  • @KarolDowbecki Again, it's close to impossible to tell what the JVM is doing internally here. There once was a question about [how to print the whole string pool](https://stackoverflow.com/q/22094111/3182664), and I tried to answer it with some adventurous hack, but really looking at the JVM-internal pool from Java side is almost certainly impossible (my gut feeling is that it could also be a security issue). A **wild** guess: Maybe in the Oracle JVM, the string appears only as part of other strings, like `"java.lang..."`, and not as an individual literal? – Marco13 Mar 27 '18 at 22:00
1

You are not using `intern` correctly. `intern` does not modify the string object it is called on (strings are immutable anyway); it returns the canonical representation of that string, which you are simply discarding. Instead, assign the result to a variable and use that variable in your checks. E.g.:

sj1 = sj1.intern();
Mureinik
1

On OpenJDK 1.8.0u151 and OpenJDK 9.0.4

char[] cj = {'j','a','v','a'};
String sj = new String(cj);
sj.intern();
String sc = "java";
System.out.println(sj == sc); 

prints true. However, this == check depends on which Strings have been interned into the String Pool before String sc = "java" is executed. Since compile-time String constants are always interned, the sc reference now points to the "java" String in the String Pool, which was put there by sj.intern() using the sj reference.

If you try allocating the String "java" before like:

String before = "java"; // interned before by compiler
char[] cj = {'j','a','v','a'};
String sj = new String(cj);
sj.intern();
String sc = "java";
System.out.println(sj == sc);

the code will now print false since sj.intern() will now have no side effects as the "java" String was interned before.
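The order dependence described above can be sketched with a string that is very unlikely to be in the pool already. This is only an illustration: the class and method names are hypothetical, and it assumes a JVM whose pool stores a reference to the interned object itself, as the `String.intern()` Javadoc describes and as HotSpot does from Java 7 onward, as far as I know.

```java
import java.util.UUID;

// Illustrative sketch only: the class and method names are hypothetical.
final class InternOrderDemo {

    // A random UUID string is practically guaranteed not to be in the pool yet.
    static String unpooled() {
        return UUID.randomUUID().toString();
    }

    // The first equal string to be interned becomes the canonical instance.
    static boolean firstInternReturnsSameObject() {
        String first = unpooled();
        return first.intern() == first;
    }

    // A later, equal copy is not added; intern() hands back the earlier instance.
    static boolean laterCopyGetsEarlierInstance() {
        String first = unpooled();
        first.intern();                                 // first is now the pooled instance
        String copy = new String(first.toCharArray());  // equal content, different object
        String canonical = copy.intern();               // no side effect on the pool
        return canonical == first && canonical != copy;
    }

    public static void main(String[] args) {
        System.out.println(firstInternReturnsSameObject());
        System.out.println(laterCopyGetsEarlierInstance());
    }
}
```

The second method mirrors the `before` example: once an equal string is in the pool, calling `intern()` on another copy changes nothing, and only the returned reference is the canonical one.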

To debug your problem, check which Strings have already been interned into the String Pool before you reach the failing check. This might depend on your JVM vendor or version.

One would argue that calling intern() just for the side effect of adding the value into the String Pool is pointless. Writing sj = sj.intern() is the right way to intern the String.

Karol Dowbecki