
I have read the following Stack Overflow question, but a couple of points are still unclear to me.

Let's assume I have the following Java class:

class Main {
    public static void main(String[] args) {
        String str = "hello";
    }
}

When Main.java is compiled, all references to variables and methods are stored in the class's constant pool (inside the Main.class file) as symbolic references. A symbolic reference is a logical reference, not one that points to a physical memory location.

The Class Constant Pool will have a Utf8 entry that represents the "hello" string literal.

Constant pool:
   #1 = Methodref          #4.#13         // java/lang/Object."<init>":()V
   #2 = String             #14            // hello
   #3 = Class              #15            // Main
   #4 = Class              #16            // java/lang/Object
   #5 = Utf8               <init>
   #6 = Utf8               ()V
   #7 = Utf8               Code
   #8 = Utf8               LineNumberTable
   #9 = Utf8               main
  #10 = Utf8               ([Ljava/lang/String;)V
  #11 = Utf8               SourceFile
  #12 = Utf8               Main.java
  #13 = NameAndType        #5:#6          // "<init>":()V
  #14 = Utf8               hello          /////// CONSTANT_Utf8_info entry ///////
  #15 = Utf8               Main
  #16 = Utf8               java/lang/Object

public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=1, locals=2, args_size=1
         0: ldc           #2                  // String hello
         2: astore_1
         3: return
      LineNumberTable:
        line 3: 0
        line 4: 3

Pay attention to the #14 Constant Pool entry.

When the JVM loads and links the Main.class file, it creates a Runtime Class Constant Pool for this class on the Heap, in the Method Area (I suppose this is an object representation of the Class Constant Pool from the .class file). For the Utf8 entry from the .class file, the JVM creates a CONSTANT_Utf8_info object during the linking process.

Based on the spec, the CONSTANT_Utf8_info object contains a bytes array that represents the "hello" string. Is this array allocated inside the Runtime Class Constant Pool, the String Pool, or the Young Generation space of the Heap?
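To make the class-file side of this concrete, here is a minimal sketch of the bytes a CONSTANT_Utf8_info entry occupies in the .class file, assuming the layout from JVMS §4.4.7 (a u1 tag with value 1, a u2 length, then the modified UTF-8 bytes). `DataOutputStream.writeUTF` writes the same length-prefixed modified UTF-8 form, so it is used here as a stand-in encoder. The class name `Utf8EntrySketch` and its helpers are hypothetical, and the sketch says nothing about where a particular JVM keeps these bytes at run time.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class Utf8EntrySketch {
    // Tag value of CONSTANT_Utf8 in the class file format (JVMS 4.4.7).
    static final int CONSTANT_UTF8_TAG = 1;

    // Builds the on-disk form of a CONSTANT_Utf8_info entry for the given string.
    static byte[] encode(String value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeByte(CONSTANT_UTF8_TAG); // u1 tag
            out.writeUTF(value);              // u2 length + modified UTF-8 bytes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Formats bytes as space-separated two-digit hex values.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // prints: 01 00 05 68 65 6c 6c 6f
        System.out.println(toHex(encode("hello")));
    }
}
```

The eight bytes are the tag, the length 5, and the five ASCII bytes of "hello"; this is what entry #14 in the listing above looks like inside Main.class.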

What does the String Pool data structure look like in depth? Does it hold references to string literals from Runtime Class Constant Pools, or is it a hashtable/array of string literals?

Based on the following article, the String Pool is considered to be a Hashtable<oop, Symbol>, but it is not clear what oop and Symbol are and what data structures they represent.

Can you explain what happens internally in the JVM when it executes the "ldc #2" instruction from the bytecode above (how memory is allocated and in which space, and what is on the stack)?
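The part of this that the specification does pin down is observable from plain Java: resolving a string literal (which is what `ldc` does for a CONSTANT_String entry) always yields the same String instance for equal literals. The sketch below only demonstrates that guarantee; the class name `LiteralIdentitySketch` is hypothetical, and nothing here reveals which memory space an implementation uses.

```java
public class LiteralIdentitySketch {
    // Each method loads the literal "hello" and returns the resolved reference.
    static String first() {
        return "hello";
    }

    static String second() {
        return "hello";
    }

    public static void main(String[] args) {
        // Equal literals resolve to one shared String instance.
        System.out.println(first() == second());      // prints: true

        // An explicitly constructed String is a distinct object with equal contents.
        String copy = new String("hello");
        System.out.println(copy == first());          // prints: false
        System.out.println(copy.equals(first()));     // prints: true
    }
}
```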

Can you say when the "hello" string literal becomes eligible for GC?

I would really appreciate it if you could include some links to the official JVM specification as proof when answering this question.

Eugene Maysyuk
  • 2
    Strings in the pool is never GC’d... that’s the point of it – Bohemian Jul 13 '18 at 01:02
  • 1
    [The JVM spec carefully does NOT specify](https://docs.oracle.com/javase/specs/jvms/se10/html/jvms-5.html#jvms-5.1) implementation details, only the semantics that all string literals with the same value are the same object as if by calls to `java.lang.String.intern()`. The implementation has in fact varied over versions of Sun/Oracle Java and OpenJDK (aka HotSpot) -- PermGen has been replaced by Metaspace -- and may be entirely different in other implementations. – dave_thompson_085 Jul 13 '18 at 05:49
  • @Bohemian, I think you are mistaken. Seems like String Pool is collected by GC. Based on the following comment and snippet provided in it https://stackoverflow.com/questions/2431540/garbage-collection-behaviour-for-string-intern/2433076#2433076 – Eugene Maysyuk Jul 15 '18 at 11:02
  • @Eugene I’m not mistaken. The whole point of the String pool is to have *permanent* references to constant Strings. – Bohemian Jul 15 '18 at 19:13
  • @Bohemian, I believe you didn't follow the link I provided in the previous comment because how would you explain that snippet and different hashcode for char array after GC collection. – Eugene Maysyuk Jul 16 '18 at 09:24
  • 1
    @Eugene I did follow that link, but it’s code with no practical value. If you call `intern()` on a String that is a literal string in your code, you will get the literal object back even after you call `gc()`. ie given `static String x = "a";`, calling `new String(new char[]{'a'}).intern();` returns `x` no matter how many times you call `gc()`. If you call `intern()` on a String that’s not a literal, ie there’s no reference to it elsewhere, it’s a candidate for gc. Big deal. Don’t call `intern()` unless you really know what you’re doing (eg consciously implementing flyweight or similar). – Bohemian Jul 16 '18 at 12:58
  • @Bohemian, now it's clear, many thanks. – Eugene Maysyuk Jul 16 '18 at 16:03
  • 1
    @Bohemian that’s only half of the story. Even classes (loaded by a custom class loader) can get garbage collected, which in turn allows even objects associated with string literals to get garbage collected, once no class has a reference to it. So the sentence “Strings in the pool is never GC’d” is misleading at best. Some strings never get garbage collected because they are referenced by code loaded by the application class loader or bootstrap class loader, but it’s not their presence in the pool which is responsible for that. – Holger Jul 17 '18 at 14:58
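The behaviour described in the comments for a literal that is still referenced can be checked with a short sketch. It assumes a class loaded by the application class loader, so the literal stays reachable through the static field; the class name `InternAfterGcSketch` is hypothetical, and `System.gc()` is only a hint to the JVM. It does not cover the class-unloading case raised in the last comment.

```java
public class InternAfterGcSketch {
    // The literal "a" stays strongly reachable through this static field.
    static String x = "a";

    // Requests a collection, then interns a freshly built String with equal contents.
    static boolean internReturnsLiteral() {
        System.gc(); // a request only; the JVM may ignore it
        String interned = new String(new char[]{'a'}).intern();
        return interned == x;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            // intern() returns the canonical instance, which is the literal held by x.
            System.out.println(internReturnsLiteral()); // prints: true
        }
    }
}
```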

0 Answers