8

Note: I'm not trying to compare whether the characters are equal; I know how to use the String.equals() method. This question is about String references.

I was studying for the OCA exam when I started to learn about the String class and its properties, such as immutability. From what I have read and understood about the String pool, when a string is created Java stores the object in what is called the String pool, and if a new string is created with the same value it refers to the string already in the pool. The exception is when we use the new keyword, which creates a new object even if both strings contain the same value.

For example:

String a = "foo";
String b = "foo";
String c = new String("foo");
boolean ab = (a == b); // This returns true
boolean ac = (a == c); // This returns false

To be clear about what this code does: on the first line of code it will create the String a = "foo" and store it in the String pool, and on the second line of code it will create the variable b and make it refer to "foo", because that string already exists in the String pool. Line 3, however, creates a new String object regardless of whether the value already exists. Here is a small diagram of what is happening: http://cdn.journaldev.com/wp-content/uploads/2012/11/String-Pool-Java1.png
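
The behaviour described above can be checked with a minimal sketch. The class name StringPoolDemo and the helper sameObject are made up for illustration; the only assumption is that == on two String references tests object identity, while equals() tests the characters.

```java
public class StringPoolDemo {
    // Hypothetical helper: true only when both references point to the same object.
    static boolean sameObject(String x, String y) {
        return x == y;
    }

    public static void main(String[] args) {
        String a = "foo";             // literal, taken from the String pool
        String b = "foo";             // same literal, same pooled object
        String c = new String("foo"); // new always creates a distinct object

        System.out.println(sameObject(a, b)); // true
        System.out.println(sameObject(a, c)); // false
        System.out.println(a.equals(c));      // true, the characters are equal
    }
}
```

The last line is a reminder that the two comparisons answer different questions: equals() asks whether the contents match, == asks whether it is one and the same object.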

The problem comes with the following lines of code. When the string is created by concatenation, does Java do something different, or does the simple == comparator have another behaviour?

Example A:

String a = "hello" + " world!";
String b = "hello world!";
boolean compare = (a == b); // This returns true

Example B:

a = "hello";
b = "hel" + "lo";
compare = (a == b); // This returns true

Example C:

a = "Bye";
a += " bye!";
b = "Bye bye!";
compare = (a == b); // This returns false

To watch the code running: (http://ideone.com/fdk6KL)

What is happening?

EDIT

  1. Fixed an error in Example B: b = "hel" + "lo"

  2. Added a clarification about the problem. It's not a comparison problem, because I know how to use String.equals(); the problem concerns references in the String pool

xsami
  • Strings can't be compared with "==", so how is the condition true? – Omore May 18 '17 at 02:21
  • @NathanHughes it's more for personal knowledge. But you got a good point – xsami May 18 '17 at 02:24
  • @John3136. Thank you for the comment. I fixed that error – xsami May 18 '17 at 02:27
  • 4
    The string pool doesn't get added to for every little concatenation - imagine a loop where someone prints `"Some Text" + i + "."` a thousand times. Do you think that deserves thousands of entries in the string pool? – D M May 18 '17 at 02:28
  • "on the first line of code it will create the `String a = "foo"`" is not correct. That line only creates the _reference_ (variable pointing) to the string, which string itself was already in the string pool, whither it was placed upon class initialization. – Lew Bloch May 18 '17 at 02:36
  • 1
    Any expression that creates a new `String` instance, such as an invocation of `new String(...)` or a concatenation, creates a fresh, non-interned instance of `String`. "string literals - or, more generally, strings that are the values of constant expressions (§15.28) - are "interned" so as to share unique instances, using the method `String.intern`." http://docs.oracle.com/javase/specs/jls/se8/html/jls-3.html#jls-3.10.5 Otherwise one must call `intern` explicitly. Read the Fine Manual. – Lew Bloch May 18 '17 at 02:40
  • Thanks @LewBloch I will fix this – xsami May 18 '17 at 02:41
  • 3
    I think you want to learn about the term *compile-time constant*. – chrylis -cautiouslyoptimistic- May 18 '17 at 02:55

2 Answers

12

"When the string is created by concatenation does java make something different or simple == comparator have another behaviour?"

No, it does not change its behavior. What happens is this:

When concatenating two string literals "a" + "b", the JVM joins the two values and then checks the string pool; it then realizes the value already exists in the pool, so it simply assigns that reference to the String. Now in more detail:

Look at the compiled bytecode below of this simple program:

public class Test  {    
    public static void main(String... args) {
        String a = "hello world!";
        String b = "hello" + " world!";
        boolean compare = (a == b);
    }
}

Simple program

First the JVM loads the string "hello world!", pushes it to the string pool (in this case), and then loads it onto the stack (ldc = load constant) [see point 1 in Image]

Then it assigns the reference created in the pool to the local variable (astore_1) [see point 2 in Image]

Notice that the reference created in the string pool for this literal is #2 [See point 3 in Image]

The next operation is much the same: it concatenates the string and pushes it to the runtime constant pool (the string pool in this case), but then it realizes a literal with the same content already exists, so it uses this reference (#2) and assigns it to a local variable (astore_2).

Thus (a == b) is true, because both of them reference string pool entry #2, which is "hello world!".

Your example C is a bit different, though, because you're using the += operator, which, when compiled to bytecode, uses StringBuilder to concatenate the strings. This creates a new StringBuilder object and thus a different reference (string pool vs. object).
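
As a minimal sketch of the difference between the examples (the class and method names are invented for illustration): a concatenation whose operands are all literals or final variables initialized with literals is a constant expression, so the compiler can fold it into a single literal. A concatenation involving an ordinary variable is evaluated at run time and produces a new String object. Which bytecode the run-time concatenation compiles to depends on the JDK version (StringBuilder in older compilers, StringConcatFactory in newer ones), but the result of the comparison should be the same either way.

```java
public class ConstantFoldingDemo {
    // Both operands are literals: a constant expression, folded at compile time.
    static boolean literalConcat() {
        String a = "hello world!";
        String b = "hello" + " world!";
        return a == b;
    }

    // final variables initialized with literals still form a constant expression.
    static boolean finalVariableConcat() {
        final String left = "hello";
        final String right = " world!";
        return (left + right) == "hello world!";
    }

    // 'a' is an ordinary variable, so += is evaluated at run time
    // and yields a new String object that is not the pooled literal.
    static boolean runtimeConcat() {
        String a = "Bye";
        a += " bye!";
        return a == "Bye bye!";
    }

    public static void main(String[] args) {
        System.out.println(literalConcat());       // true
        System.out.println(finalVariableConcat()); // true
        System.out.println(runtimeConcat());       // false
    }
}
```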

  • In addition to your code: `final String l = "Hello"; final String d = " Motto"; System.out.println((l+d) == "Hello Motto");` prints `true`, but if we remove the keyword `final` from one of the Strings it prints `false`. Thank you for the answer. – xsami May 18 '17 at 15:33
  • 1
    Since both `ldc #2` instructions are in the *compiled* code, it’s the *compiler* that did the string concatenation and constant folding. The JVM sees two `ldc` instructions pointing to the same pool entry (`#2`). It doesn’t have to “realize” that these constants are identical, that’s obvious. At runtime, no string concatenation is happening. The JVM doesn’t even know that the second use of that constant has been a string concatenation at compile time. – Holger Nov 24 '17 at 08:02
  • 1
    Interesting read on the same topic - https://www.quora.com/Java-When-we-concatenate-two-strings-using-the-+-operator-will-the-resulting-string-be-stored-in-the-string-literal-pool-or-not?share=1 – Pavan Kumar Nov 27 '18 at 05:54
  • 2
    The answer concerning "Example C" is incorrect. First: += operator when compiled to bytecode does not use StringBuilder to concatenate strings. It uses StringConcatFactory.makeConcatWithConstants. Second: it returns false because when using += operator, the newly created string is not added to the String Pool. Strings which are created with literal string notation (your double-quoted texts) and strings marked as final variables - get added to String Pool. Concatenated string is created as a new object on the heap. When compared to strings in string pool then they hold different references. – Miko Mar 08 '21 at 13:50
  • OK, thanks for the comment. So I understand that when the concatenation is on the same line as the declaration, the compiler translates the code into a single literal to simplify it. To us it may look like several values on the same line, but to the compiler it is only one (once every variable is replaced, it ends up as a literal), and for this reason the comparison is true. On the other hand, this does not happen when the concatenation is on different lines, because the result is not a literal, so those variables will not be located in the String pool. – J. Abel Mar 11 '21 at 04:27
3
String a = "ef";
String b = "cd" + a;
        
System.out.println("cdef"==b); // false
        
String c = "cd" + "ef";
        
System.out.println("cdef"==c); // true

When the intern() method is invoked on a String object, it tries to find a String with the same sequence of characters in the pool.

If the String pool already has a String with the same value, the reference to that String from the pool is returned; otherwise the string is added to the pool and its reference is returned.

String concatenation only yields an interned string if the expression is a constant expression.

"cd" + a // is not a Constant Expression
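
To illustrate, here is a minimal sketch in which a run-time concatenation is interned explicitly. The class InternDemo and the helper join are hypothetical names; join takes parameters precisely so that the concatenation cannot be a constant expression.

```java
public class InternDemo {
    // Method parameters are never constants, so this concatenation happens at run time.
    static String join(String prefix, String suffix) {
        return prefix + suffix;
    }

    public static void main(String[] args) {
        String b = join("cd", "ef");

        System.out.println("cdef" == b);          // false, b is a new object
        System.out.println("cdef" == b.intern()); // true, intern() returns the pooled instance
        System.out.println("cdef".equals(b));     // true, the contents are equal
    }
}
```

Calling intern() is rarely needed in application code; equals() is the usual way to compare string contents.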
Yuresh Karunanayake