
After reading these discussions - question 1, question 2, article

I have the following understanding of the Java String Constant Pool (please correct me if I am wrong):

When the source code is compiled, the compiler looks for all the string literals (the ones in double quotes) in the program, creates distinct (no duplicates) objects in the heap area, and maintains their references in a special memory area called the String Constant Pool (an area inside the method area). Any other string objects are created at run time.

Suppose our code has the following statements:

String a = "abc";                  //Line 1
String b = "xyz";                  //Line 2
String c = "abc";                  //Line 3
String d = new String("abc");      //Line 4

When the above code is compiled,

Line 1: a String object "abc" is created in heap and this object is referenced by variable a and String Constant Pool.


Line 2: The compiler searches the String Constant Pool for an existing reference to the object "xyz" but does not find one. So it creates the object "xyz" and puts its reference in the String Constant Pool.


Line 3: This time the compiler finds the object in the String Constant Pool and does not make any additional entry in the pool or the heap. Variable c just refers to the existing object, which is also referred to by a.


Line 4: The literal in Line 4 is already present in the String Constant Pool, so no new entry is made in the pool. At run time, however, another String object is created for "abc" and its reference is stored in variable d.

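The reference relationships described for Lines 1 to 4 can be checked at run time with `==`, which compares object identity rather than contents. The following is only a minimal sketch: the class and method names are hypothetical and not part of the original code, and it shows what can be observed at run time, not when or where the objects are created.

```java
// Hypothetical demo class; all names here are illustrative assumptions.
class StringLiteralReferences {

    // a and c come from the same literal, so they should be the same object.
    static boolean literalsShareInstance() {
        String a = "abc";
        String c = "abc";
        return a == c;
    }

    // new String(...) always creates a fresh object, distinct from the literal.
    static boolean newStringIsDistinct() {
        String a = "abc";
        String d = new String("abc");
        return a != d && a.equals(d);
    }

    public static void main(String[] args) {
        System.out.println(literalsShareInstance()); // true
        System.out.println(newStringIsDistinct());   // true
    }
}
```

The language specification guarantees that identical literals refer to the same `String` instance, while `new String("abc")` yields a separate object with equal contents.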

Now I have the following questions/doubts:

  1. Is what I described above exactly what happens?
  2. How does the compiler create objects? As far as I know, objects are created at run time, and the heap is a run-time memory area. So how and where are String objects created at compile time?
  3. Source code can be compiled on one machine and run on another. Even on the same machine, it can be compiled and run at different times. How, then, are those objects (created at compile time) recovered?
  4. What happens when we intern a String?
Deb
  • https://stackoverflow.com/questions/10578984/what-is-string-interning - Of course the string objects are not the _same_ across machines, they are the same in a certain JVM. [The string constant pool _is not_ created "when the source code is compiled."](https://stackoverflow.com/questions/4918399/where-does-javas-string-constant-pool-live-the-heap-or-the-stack) – Salem Jan 11 '18 at 18:45
  • Can you please elaborate? If you know the answer(s), please give in a greater detail. The question linked has only information about string interning. But not how the constant pool is created. @Mango – Deb Jan 11 '18 at 18:51

1 Answer

  1. Is what I described above exactly what happens?

Yes, conceptually. However, the constant pool and the string pool are different things.

The constant pool is a part of a .class file that contains all constants used in this class.

The string pool is a runtime concept - interned strings and string literals are stored here.

Here's the JVM specification on the constant pool. It is part of the section on the .class format.

  2. How does the compiler create objects? As far as I know, objects are created at run time, and the heap is a run-time memory area. So how and where are String objects created at compile time?

How and when exactly this happens is, I believe, a JVM implementation-specific detail (correct me if I am wrong). The basic explanation is that whenever the JVM decides to load a class, any strings found in the constant pool are automatically placed into the runtime string pool, and any duplicates are made to refer to the same instance.

In one of the linked answers' comments, Paŭlo Ebermann says:

when the classes are loaded in the VM, the string constants will get copied to the heap, to a VM-wide string pool

so it seems this is at least how Sun's VM implemented the string pool.
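The visible effect of duplicates referring to one instance can be sketched with two classes that each contain the literal `"abc"`. The class names below are hypothetical. The sketch only checks the result (one shared `String` object) and assumes nothing about the exact moment the JVM creates it.

```java
// Hypothetical classes for illustration; each one carries its own "abc"
// entry in its own class file's constant pool.
class SharedLiteralDemo {

    // True when both classes hand back the very same String object.
    static boolean sameInstanceAcrossClasses() {
        return FirstHolder.literal() == SecondHolder.literal();
    }

    public static void main(String[] args) {
        System.out.println(sameInstanceAcrossClasses()); // true
    }
}

class FirstHolder {
    static String literal() {
        return "abc";
    }
}

class SecondHolder {
    static String literal() {
        return "abc";
    }
}
```

Although each compiled class stores the text separately, at run time both literals resolve to a single entry in the string pool.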

Prior to JDK 7, HotSpot stored interned strings in the permanent generation space; now they are stored in the main heap.

  3. Source code can be compiled on one machine and run on another. Even on the same machine, it can be compiled and run at different times. How, then, are those objects (created at compile time) recovered?

Constants are stored in the compiled files. They are therefore retrievable whenever the JVM decides to load the class.

  4. What happens when we intern a String?

This is answered here:

doing String.intern() on a series of strings will ensure that all strings having same contents share same memory
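As a rough illustration of that quote, the sketch below builds two equal strings at run time and interns them. The class and the `build` helper are hypothetical names introduced only for this example. `StringBuilder` is used so that the compiler cannot fold the pieces into a literal.

```java
// Hypothetical demo class; names are illustrative assumptions.
class InternDemo {

    // Builds a new String object at run time from two pieces.
    static String build(String prefix, String suffix) {
        return new StringBuilder(prefix).append(suffix).toString();
    }

    // Two run-time strings with equal contents are separate objects.
    static boolean runtimeStringsAreDistinct() {
        String first = build("ab", "c");
        String second = build("a", "bc");
        return first != second && first.equals(second);
    }

    // After interning, equal strings resolve to one shared object,
    // which is also the object that the literal "abc" refers to.
    static boolean internedStringsAreShared() {
        String first = build("ab", "c").intern();
        String second = build("a", "bc").intern();
        return first == second && first == "abc";
    }

    public static void main(String[] args) {
        System.out.println(runtimeStringsAreDistinct()); // true
        System.out.println(internedStringsAreShared());  // true
    }
}
```

`intern()` returns the pooled instance for the given contents, adding the string to the pool first if no equal string is there yet.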

Salem
  • If I understood correctly, the string literals are stored in `.class` files similar to any other data. When JVM loads that class, it stores those string data in String Pool. While loading the next class it only puts the non existing Strings into the pool. But, https://stackoverflow.com/a/1881936/3019006 this answer states that the "Java compiler is smart enough to make.." what does that mean? How compiler is coming into the picture if it is done by the JVM? – Deb Jan 11 '18 at 19:26
  • @Deb I think this was a miswording - all the compiler does is place the string into the constant pool, and uses this constant instead of a call to `new String(...)`. The JVM does the job of placing interned strings into the heap. – Salem Jan 11 '18 at 19:44
  • If that is the case, then suppose I have `A.java` containing a string "abc" and `B.java` also containing "abc". Will "abc" be stored in `A.class`, in `B.class`, or in both? If it is stored in the **constant pool** of both class files, how is the purpose of not creating redundant strings served? – Deb Jan 12 '18 at 04:57
  • @Deb "abc" in _binary_ would be stored in both `.class` files, but once both classes are loaded, the (runtime) string pool would contain only one "abc" string. Both references to "abc" would be made to refer to this one string. – Salem Jan 12 '18 at 06:26
  • Makes sense. Now If `A.java` has two "abc" string then `A.class` will contain how many String constant, 1 or 2? @Mango – Deb Jan 12 '18 at 17:49
  • @Deb: if a class contains multiple occurrences of the same string constant, e.g. `"abc"`, they are *usually* compiled to a single entry of the class’ constant pool. The specification does not require it, but all commonly used compilers do it, as it shortens the class file and reduces the work at runtime. – Holger Jan 16 '18 at 10:17
  • Note that the statement from Paŭlo Ebermann is wrong. In the case of the widely used HotSpot JVM, the one formerly developed by Sun, the `String` instances representing constants are created on their first actual use, rather than class loading, which can be proven as demonstrated by [this answer](https://stackoverflow.com/a/44929935/2711488). In principle, creating them at class loading time would be a valid strategy and perhaps, HotSpot did so, a loooooong time ago… – Holger Jan 16 '18 at 10:46