
I looked it up in a book, which is usually more thorough in terms of explanations than a website.

Take this, for example:

if (nickname == "Bob")

The condition will be true only if nickname is referring to the same String object.

Here is a sentence I found confusing. Can anyone please explain why this is the case?

For efficiency, Java makes only one string object for every string constant.

The book points out that the way the object "Bob" is assembled also affects whether the condition will be true or not, which confuses me the most.

For example:

String nickname = "Bob";
...
if (nickname == "Bob") //TRUE

But if the string is created by the .substring() method, the condition will be FALSE.

String name = "Robert";
String nickname = name.substring(0,3);
...
if (nickname == "Rob")//FALSE

Why is this so?

Edit: at the end of the book's explanation, I found a sentence which also confuses me a lot:

Because string objects are always constructed by the compiler, you never have an interest in whether two strings objects are shared.

Doesn't everything we write get constructed by the compiler?

  • Sometimes compilers will optimize strings, so everything that refers to "String", no matter where it is defined, will always be the same "String" string thing. When you use a function, it directly creates a new string, since the compiler isn't smart enough to know it's the same string. – Stephen J Apr 03 '15 at 22:34
  • If you're comparing two strings in java to see if their _contents_ are equivalent, you should **always** use the .equals() method. – maxton Apr 03 '15 at 22:36
  • @TomG The question is less about references than about how Java stores string literals in memory. It is not a duplicate, especially with that. The same goes for the second dupe suggestion. – Shade Apr 03 '15 at 22:37
  • *"Doesn't everything we write get constructed by the compiler?"* Not necessarily, e.g. `String a = "Hello " + "World!"` only adds `"Hello World!"` to the string pool, not `"Hello "` and `"World!"`. – fabian Apr 03 '15 at 22:51
  • @fabian, isn't that what you guys call temporary strings, btw, why use + when what are on both sides strings, isn't that only needed if you want to convert something else to a string? – most venerable sir Apr 04 '15 at 21:18

3 Answers

You need to understand two things.

1)

String a = "Bob";
String b = "Bob";

System.out.println(a.equals(b));
System.out.println(a == b);

What do you think the output is?

true
true

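What is happening here? The first literal creates a String object in the string pool (held in PermGen in Java 6 and earlier). The second literal finds that an equal string is already in the pool and gets the existing object, so both variables refer to the same object.

String a = "Bob"; // creates the object in the string pool
String b = "Bob"; // reuses the existing pooled object

As you correctly noticed: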

For efficiency, Java makes only one string object for every string constant.

2)

String nickname = name.substring(0,3);

Because String is immutable, name.substring(0,3) cannot modify name. Instead it creates a new String object with the contents "Rob" in ordinary heap memory at run time, and that object is not the one stored in the string pool.

Note:

Up to Java 6 the string pool lived in the PermGen area, where garbage collection can occur but its behavior varies from JVM to JVM. Since JDK 7 the string pool has been located in the main heap, where ordinary objects are created, and Java 8 removed PermGen altogether.
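A minimal sketch of the two cases above, assuming a typical JVM. The class name `SubstringIdentityDemo` and the helper `sameObject` are made up for illustration and are not from the book.

```java
// Sketch: a string literal versus a string built at run time by substring().
class SubstringIdentityDemo {
    // True only if both references point at the very same object.
    static boolean sameObject(String x, String y) {
        return x == y;
    }

    public static void main(String[] args) {
        String name = "Robert";
        String nickname = name.substring(0, 3); // new String object built at run time

        System.out.println(sameObject(nickname, "Rob")); // false: two different objects
        System.out.println(nickname.equals("Rob"));      // true: same characters
    }
}
```

The contents match in both checks; only the identity of the objects differs, which is why `equals()` is the comparison to use here.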

Read more here.

Sergey Shustikov
  • [What is String literal pool?](http://www.xyzws.com/Javafaq/what-is-string-literal-pool/3) for those who are unfamiliar with pools. – Pétur Ingi Egilsson Apr 03 '15 at 22:38
  • ...shhh, there is no permgen in Java 8... – Clashsoft Apr 03 '15 at 22:38
  • @Clashsoft, sorry for my mistake. In Java 8 **string pool moved to heap.** I add this to answer for future searches. – Sergey Shustikov Apr 03 '15 at 22:46
  • 2 questions: Q1: (from a website) when creating a object using literal syntax,eg "Bob", a existent object is returned, and if not already existent will be created and store in the pool. Why then do you call b a existent object, if only a is already created?(Isn't Java simply constructing a new String object named b and automatically invoke .intern() to store it in the pool?) Q2: since Permgen is moved to heap,( I am not sure if they are now the same or Permgen is now just a subset of heap, what is a more elaborate/ convincing reason for your 2nd explaination? – most venerable sir Apr 04 '15 at 23:46

String literals are internally handled by the JVM so that for every unique String literal, it always refers to the same object if it has the same value. For example, a string literal "test" in class A will be the exact same object as a string literal "test" in class B.
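As a rough illustration of that claim, here is a sketch with two unrelated classes that each contain the literal "test". The class names `LiteralHolderA`, `LiteralHolderB`, and `SharedLiteralDemo` are invented for this example.

```java
// Sketch: the same literal written in two different classes.
class LiteralHolderA {
    static String value() {
        return "test"; // literal stored in LiteralHolderA's constant pool
    }
}

class LiteralHolderB {
    static String value() {
        return "test"; // literal stored in LiteralHolderB's constant pool
    }
}

class SharedLiteralDemo {
    public static void main(String[] args) {
        // Both literals resolve to the same interned String object.
        System.out.println(LiteralHolderA.value() == LiteralHolderB.value()); // true
    }
}
```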

Doesn't everything we write get constructed by the compiler?

The compiler simply adds the string literal to the class's constant pool at compile time and loads it with a special instruction called LDC. The rest is handled by the JVM, which resolves the constant against a special string pool (previously located in PermGen), where a literal stays available for as long as the class that uses it is loaded.

However, you can get the 'internal' (interned) version of any string, as if it were a string literal, using String#intern(), which makes the == operator work again.
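A small sketch of that last point using `String.intern()`, the standard library method that returns the pooled copy of a string. The class name `InternDemo` and the helper `buildAtRuntime` are hypothetical.

```java
// Sketch: interning a run-time string so that == matches the literal.
class InternDemo {
    // Builds "Rob" at run time, so the result is not the pooled literal.
    static String buildAtRuntime() {
        return "Robert".substring(0, 3);
    }

    public static void main(String[] args) {
        String nickname = buildAtRuntime();

        System.out.println(nickname == "Rob");          // false: separate object
        System.out.println(nickname.intern() == "Rob"); // true: intern() returns the pooled object
    }
}
```

Interning only makes `==` usable; for ordinary content comparison `equals()` is still the simpler choice.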

Clashsoft
  • Three questions: Q1: Can I say a string literal is whatever is enclosed between ""? (I read some articles but still feel confused.) Q2: Don't different classes get their own versions/copies of the string constant pool (unless you group them into a project in the compiler)? Q3: So basically, the String#intern() method makes the two memory addresses the same, thus the condition becomes true? – most venerable sir Apr 04 '15 at 21:26
  • A1: yep. A2: no, classes have a constant pool that is just used so the bytecode is smaller. It has nothing to do with actual constant pools like the string constant pool. A3: yep. – Clashsoft Apr 04 '15 at 23:05

It's about objects.

Since these aren't primitives, == doesn't compare what they are. == compares where they are (in heap memory).

.equals() should (if implemented) compare what's contained in that memory.

This is a detail that is easily forgotten because string literals and small boxed numbers often don't get new memory when created: it's more efficient to point you to a cached version of the same thing instead. Thus you can ask for "Bob" over and over and just get handed a reference (memory address) to the same "Bob". This tempts us to compare them like primitives, since that seems to work the same way. But not every object will have this happen to it, so it's a bad habit to let yourself develop.

This trick works only when 1) a matching object already exists, and 2) it's immutable, so you can't surprise users of other "copies" by changing it.
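The same caching shows up with boxed numbers. A hedged sketch, with an invented class name `BoxedCacheDemo`: Java guarantees that autoboxed `Integer` values from -128 to 127 are shared, while larger values may or may not be, depending on JVM settings.

```java
// Sketch: cached versus non-cached boxed integers.
class BoxedCacheDemo {
    // True only if both references point at the very same object.
    static boolean sameObject(Object x, Object y) {
        return x == y;
    }

    public static void main(String[] args) {
        Integer a = 127, b = 127;   // inside the guaranteed cache range
        Integer c = 1000, d = 1000; // outside the guaranteed range

        System.out.println(sameObject(a, b)); // true: both come from the cache
        System.out.println(sameObject(c, d)); // usually false, but depends on JVM cache settings
        System.out.println(c.equals(d));      // true: same numeric value
    }
}
```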

To abuse an old metaphor, if two people have the same address it's a safe bet that they keep the same things at home, since it's the same home. However, just because two people have different addresses doesn't mean they don't keep exactly the same things at home.

Implementing .equals() is all about defining what we care about when comparing what is kept in these objects.
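For instance, a sketch of a class that decides two objects are equal when their name and age match. The `Person` class and its fields are hypothetical, chosen only to illustrate the idea.

```java
import java.util.Objects;

// Sketch: a value-style class that defines its own notion of equality.
final class Person {
    private final String name;
    private final int age;

    Person(String name, int age) {
        this.name = Objects.requireNonNull(name);
        this.age = age;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;               // same object, trivially equal
        if (!(other instanceof Person)) return false; // also rejects null
        Person that = (Person) other;
        return age == that.age && name.equals(that.name);
    }

    @Override
    public int hashCode() {
        // Must agree with equals(): equal objects produce equal hash codes.
        return Objects.hash(name, age);
    }
}

class EqualsDemo {
    public static void main(String[] args) {
        Person first = new Person("Bob", 30);
        Person second = new Person("Bob", 30);

        System.out.println(first == second);      // false: two separate objects
        System.out.println(first.equals(second)); // true: same name and age
    }
}
```

Overriding `hashCode()` alongside `equals()` keeps the class usable in hash-based collections such as `HashMap` and `HashSet`.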

So only trust == to compare values of primitives. Use .equals() to ask an object what it thinks it's equal to.

Also, this isn't just a Java issue. Every object-oriented language that lets you directly handle primitives and object references/pointers/memory addresses will force you to deal with them differently, because a reference to an object is not the object itself.

An object's value is not the same as its identity. If it were, there would only ever be one copy of an object with the same contents. Since the language can't perfectly make that happen, you're stuck having to deal with these two concepts differently.

candied_orange