127

What is meant by String Pool? And what is the difference between the following declarations:

String s = "hello";
String s = new String("hello");

Is there any difference between the storing of these two strings by the JVM?

Ciro Santilli OurBigBook.com
Saurabh Gokhale
  • 3
    Closely related: [String `==` vs `.equals` in Java](http://stackoverflow.com/questions/513832/how-do-i-compare-strings-in-java) – Ciro Santilli OurBigBook.com Feb 27 '15 at 06:59
  • 2
    Related topic: [*JEP 192: String Deduplication in G1*](http://openjdk.java.net/jeps/192): “Reduce the Java heap live-data set by enhancing the G1 garbage collector so that duplicate instances of String are automatically and continuously deduplicated.” – Basil Bourque Jun 07 '18 at 22:48

5 Answers

164

The string pool is the JVM's particular implementation of the concept of string interning:

In computer science, string interning is a method of storing only one copy of each distinct string value, which must be immutable. Interning strings makes some string processing tasks more time- or space-efficient at the cost of requiring more time when the string is created or interned. The distinct values are stored in a string intern pool.

Basically, a string intern pool allows a runtime to save memory by keeping immutable strings in a pool, so that different areas of the application can reuse a single instance of each common string instead of creating multiple instances of it.

As an interesting side note, string interning is an example of the flyweight design pattern:

Flyweight is a software design pattern. A flyweight is an object that minimizes memory use by sharing as much data as possible with other similar objects; it is a way to use objects in large numbers when a simple repeated representation would use an unacceptable amount of memory.
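
To make the idea concrete, here is a minimal hand-rolled intern pool. It is only a sketch of the concept: the class name `SimpleInternPool` and its `intern` method are hypothetical, and this is not how the JVM implements its own string pool.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of interning (flyweight); NOT the JVM's implementation.
class SimpleInternPool {
    private final Map<String, String> pool = new HashMap<>();

    // Returns the canonical instance for the given value, registering it on first sight.
    String intern(String s) {
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    public static void main(String[] args) {
        SimpleInternPool pool = new SimpleInternPool();
        String first = new String("hello");   // two distinct objects with equal contents
        String second = new String("hello");
        System.out.println(first == second);                           // false
        System.out.println(pool.intern(first) == pool.intern(second)); // true
    }
}
```

Sharing one instance like this is only safe because the pooled values are immutable: no holder of the shared reference can change it for the others.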

Andrew Hare
  • 17
    Great answer, but it doesn't directly answer the question. From your description, it sounds like the code example would have both reference the same memory, correct? Perhaps you can add a simple summary statement to your answer. – James Oravec Sep 15 '14 at 15:17
  • Incorrect. The code example would use the same interned string literal in both cases, but the 2nd line creates a new Object. If it helps to conceptualize it, think of the 1st line as: `String s = GlobalStringObjectCache.get("hello");` – Charles Goodwin May 30 '16 at 12:07
  • 8
    Copy-pasting an answer from google that doesn't even answer the question should not get so many upvotes – ineedahero Apr 25 '17 at 16:51
61

The string pool allows string constants to be reused, which is possible because strings in Java are immutable. If you repeat the same string constant all over the place in your Java code, the JVM can keep only one copy of that string in memory, which is one of the advantages of this mechanism.

When you use `String s = "string constant";` you get the copy that is in the string pool. However, when you do `String s = new String("string constant");` you force a copy to be allocated: a new String object that is distinct from the pooled instance, even though its contents are equal.
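
A short sketch of the difference, using only standard `String` behaviour. The class and method names (`LiteralVsNew`, `compare`) are made up for this illustration.

```java
// Hypothetical demo class; only the String semantics shown are standard.
class LiteralVsNew {
    static String compare() {
        String a = "hello";              // taken from the string pool
        String b = "hello";              // same pooled instance as a
        String c = new String("hello");  // always a freshly allocated object

        return (a == b) + " "            // same reference
             + (a == c) + " "            // different objects
             + a.equals(c) + " "         // equal contents
             + (a == c.intern());        // intern() returns the pooled instance
    }

    public static void main(String[] args) {
        System.out.println(compare()); // prints: true false true true
    }
}
```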

Michael Aaron Safyan
  • You mean that this way there are two copies of "string constant" in memory? I know String s = "string constant" will allocate it in the string pool. Where will String s = new String("string constant") allocate the string? – liam xu May 09 '16 at 07:43
  • 4
    The second code fragment allocates a new reference to the existing literal in the pool, not a copy. There is only one copy of the literal in memory. – Software Engineer May 27 '16 at 01:37
  • "when you do String s = new String("string constant"); you force a copy to be allocated", could you explain it more detail? what is "copy"? – frank_liu May 10 '21 at 14:14
28

JLS

As mentioned by Andrew, the concept is called "interning" by the JLS.

Relevant passage from JLS 7 3.10.5:

Moreover, a string literal always refers to the same instance of class String. This is because string literals - or, more generally, strings that are the values of constant expressions (§15.28) - are "interned" so as to share unique instances, using the method String.intern.

Example 3.10.5-1. String Literals

The program consisting of the compilation unit (§7.3):

package testPackage;
class Test {
    public static void main(String[] args) {
        String hello = "Hello", lo = "lo";
        System.out.print((hello == "Hello") + " ");
        System.out.print((Other.hello == hello) + " ");
        System.out.print((other.Other.hello == hello) + " ");
        System.out.print((hello == ("Hel"+"lo")) + " ");
        System.out.print((hello == ("Hel"+lo)) + " ");
        System.out.println(hello == ("Hel"+lo).intern());
    }
}
class Other { static String hello = "Hello"; }

and the compilation unit:

package other;
public class Other { public static String hello = "Hello"; }

produces the output:

true true true true false true
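
The single false in that output comes from `"Hel"+lo`: since `lo` is an ordinary variable, the concatenation happens at run time and yields a new object. As a small sketch (the class name `ConstantFolding` and the variable `finalLo` are hypothetical), declaring the variable `final` turns the concatenation into a constant expression, which is interned like a literal:

```java
// Hypothetical demo of compile-time versus run-time concatenation.
class ConstantFolding {
    static String results() {
        String hello = "Hello";
        String lo = "lo";            // ordinary variable: "Hel" + lo is computed at run time
        final String finalLo = "lo"; // constant variable: "Hel" + finalLo is a constant expression

        boolean runtimeConcat = hello == ("Hel" + lo);            // newly created String
        boolean compileTimeConcat = hello == ("Hel" + finalLo);   // folded to the literal "Hello"
        boolean internedConcat = hello == ("Hel" + lo).intern();  // explicitly interned
        return runtimeConcat + " " + compileTimeConcat + " " + internedConcat;
    }

    public static void main(String[] args) {
        System.out.println(results()); // prints: false true true
    }
}
```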

JVMS

JVMS 7 5.1 says:

A string literal is a reference to an instance of class String, and is derived from a CONSTANT_String_info structure (§4.4.3) in the binary representation of a class or interface. The CONSTANT_String_info structure gives the sequence of Unicode code points constituting the string literal.

The Java programming language requires that identical string literals (that is, literals that contain the same sequence of code points) must refer to the same instance of class String (JLS §3.10.5). In addition, if the method String.intern is called on any string, the result is a reference to the same class instance that would be returned if that string appeared as a literal. Thus, the following expression must have the value true:

("a" + "b" + "c").intern() == "abc"

To derive a string literal, the Java Virtual Machine examines the sequence of code points given by the CONSTANT_String_info structure.

  • If the method String.intern has previously been called on an instance of class String containing a sequence of Unicode code points identical to that given by the CONSTANT_String_info structure, then the result of string literal derivation is a reference to that same instance of class String.

  • Otherwise, a new instance of class String is created containing the sequence of Unicode code points given by the CONSTANT_String_info structure; a reference to that class instance is the result of string literal derivation. Finally, the intern method of the new String instance is invoked.
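
The relationship between `String.intern` and literals described above can be checked with a string that is built at run time. This is a sketch with a made-up class name (`InternMatchesLiteral`); it relies only on the guarantees quoted from the JLS and JVMS.

```java
// Hypothetical demo: intern() on any string yields the instance a literal would refer to.
class InternMatchesLiteral {
    static String results() {
        // Built from a char array at run time, so this object does not come from the pool.
        String built = new String(new char[] {'a', 'b', 'c'});

        boolean sameWithoutIntern = built == "abc";                    // distinct object
        boolean sameAfterIntern = built.intern() == "abc";             // pooled instance
        boolean jvmsExpression = ("a" + "b" + "c").intern() == "abc";  // expression from the JVMS
        return sameWithoutIntern + " " + sameAfterIntern + " " + jvmsExpression;
    }

    public static void main(String[] args) {
        System.out.println(results()); // prints: false true true
    }
}
```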

Bytecode

It is also instructive to look at the bytecode implementation on OpenJDK 7.

If we compile the following class and disassemble it with javap -v:

public class StringPool {
    public static void main(String[] args) {
        String a = "abc";
        String b = "abc";
        String c = new String("abc");
        System.out.println(a);
        System.out.println(b);
        System.out.println(a == c);
    }
}

we have on the constant pool:

#2 = String             #32   // abc
[...]
#32 = Utf8               abc

and main:

 0: ldc           #2          // String abc
 2: astore_1
 3: ldc           #2          // String abc
 5: astore_2
 6: new           #3          // class java/lang/String
 9: dup
10: ldc           #2          // String abc
12: invokespecial #4          // Method java/lang/String."<init>":(Ljava/lang/String;)V
15: astore_3
16: getstatic     #5          // Field java/lang/System.out:Ljava/io/PrintStream;
19: aload_1
20: invokevirtual #6          // Method java/io/PrintStream.println:(Ljava/lang/String;)V
23: getstatic     #5          // Field java/lang/System.out:Ljava/io/PrintStream;
26: aload_2
27: invokevirtual #6          // Method java/io/PrintStream.println:(Ljava/lang/String;)V
30: getstatic     #5          // Field java/lang/System.out:Ljava/io/PrintStream;
33: aload_1
34: aload_3
35: if_acmpne     42
38: iconst_1
39: goto          43
42: iconst_0
43: invokevirtual #7          // Method java/io/PrintStream.println:(Z)V

Note how:

  • 0 and 3: the same ldc #2 constant is loaded (the literals)
  • 6 to 12: a new String instance is allocated by new at 6 and initialized by the constructor call at 12 (with #2 as argument)
  • 35: a and c are compared as regular objects with if_acmpne

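The representation of constant strings is quite magic on the bytecode: in the constant pool shown above, the String entry #2 only points to the Utf8 entry #32 that holds the character data, and the JVMS quote above seems to say that whenever the Utf8 pointed to is the same, then identical instances are loaded by ldc.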

I have done similar tests for fields, and:

  • static final String s = "abc" points to the constant table through the ConstantValue Attribute
  • non-final fields don't have that attribute, but can still be initialized with ldc
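
A class along the following lines could be used to repeat the field test. The names (`FieldConstants`, `CONSTANT`, `plain`, `instance`) are hypothetical, and the remarks about the ConstantValue attribute in the comments restate the observations above rather than anything this code verifies by itself. Compiling it and running `javap -v -p FieldConstants` shows the constant pool and field attributes.

```java
// Hypothetical class for inspecting how String fields are initialized.
class FieldConstants {
    static final String CONSTANT = "abc"; // constant field (ConstantValue attribute, per the test above)
    static String plain = "abc";          // non-final static field, assigned in the static initializer
    String instance = "abc";              // instance field, assigned in the constructor

    // All three fields are initialized from the same literal, so they share one instance.
    static boolean allSameInstance() {
        return CONSTANT == plain && plain == new FieldConstants().instance;
    }

    public static void main(String[] args) {
        System.out.println(allSameInstance()); // prints: true
    }
}
```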

Conclusion: there is direct bytecode support for the string pool, and the memory representation is efficient.

Bonus: compare that to the Integer pool, which does not have direct bytecode support (i.e. no CONSTANT_String_info analogue).
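
For comparison, a small sketch of the Integer pool. Boxing an `int` literal compiles to a call to `Integer.valueOf` rather than to a dedicated constant pool load, and only values from -128 to 127 are guaranteed to be shared. The class and method names are made up for this illustration.

```java
// Hypothetical demo of the Integer cache used by autoboxing.
class IntegerCacheDemo {
    static boolean smallValuesShared() {
        Integer x = 127; // autoboxing goes through Integer.valueOf(127)
        Integer y = 127;
        return x == y;   // true: values from -128 to 127 are always cached
    }

    static boolean largeValuesEqual() {
        Integer x = 1000;
        Integer y = 1000;
        // x == y is not guaranteed either way for values outside the cached range,
        // so compare the contents instead.
        return x.equals(y);
    }

    public static void main(String[] args) {
        System.out.println(smallValuesShared()); // prints: true
        System.out.println(largeValuesEqual());  // prints: true
    }
}
```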

Ciro Santilli OurBigBook.com
  • There are 2 different objects: one in the string pool containing abc, with two references (a and b), and another on the heap containing abc, with one reference (c). – Ajay Takur Oct 17 '18 at 14:24
17

String objects are basically wrappers around the character data of a string (the backing array stored in the private `value` field). Unique string objects are pooled to prevent unnecessary object creation, and the JVM may decide to pool string literals internally. There is also direct bytecode support for String constants which are referenced multiple times, provided the compiler supports this.

When you use a literal, say `String str = "abc";`, the object in the pool is used. If you use `String str = new String("abc");`, a new object is created, but the character data of the existing string literal may be reused on either the JVM level or bytecode level (at compile time).

You can check this for yourself by creating several strings and using the == operator to check for object identity, as the nested for loops below do. In the following example, the `value` field is private to String and holds the backing array of the string (a char[] up to Java 8, a byte[] since Java 9). Because it is private, it has to be accessed via reflection.

import java.lang.reflect.Field;

public class InternTest {
    public static void main(String[] args) {
        String rehi = "rehi";
        String rehi2 = "rehi";
        String rehi2a = "not rehi";
        String rehi3 = new String("rehi");
        String rehi3a = new String("not rehi");
        String rehi4 = new String(rehi);
        String rehi5 = new String(rehi2);
        String rehi6 = new String(rehi2a);

        String[] arr  = new String[] { rehi, rehi2, rehi2a, rehi3, rehi3a, rehi4, rehi5, rehi6 };
        String[] arr2 = new String[] { "rehi", "rehi (2)", "not rehi", "new String(\"rehi\")", "new String(\"not rehi\")", "new String(rehi)", "new String(rehi (2))", "new String(not rehi)" };

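        // String.value is private, so reflection is needed to compare the backing arrays.
        // On Java 16 and later, setAccessible throws InaccessibleObjectException here unless
        // the JVM is started with --add-opens java.base/java.lang=ALL-UNNAMED.
        Field f;
        try {
            f = String.class.getDeclaredField("value");
            f.setAccessible(true);
        } catch (NoSuchFieldException | SecurityException e) {
            throw new IllegalStateException(e);
        }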

        for (int i = 0; i < arr.length; i++) {
            for (int j = 0; j < arr.length; j++) {
                System.out.println("i: " +arr2[i]+", j: " +arr2[j]);
                System.out.println("i==j: " + (arr[i] == arr[j]));
                System.out.println("i equals j: " + (arr[i].equals(arr[j])));
                try {
                    System.out.println("i.value==j.value: " + (f.get(arr[i]) == f.get(arr[j])));
                } catch (IllegalArgumentException | IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
                System.out.println("========");
            }
        }
    }
}

Output:

i: rehi, j: rehi
i==j: true
i equals j: true
i.value==j.value: true
========
i: rehi, j: rehi (2)
i==j: true
i equals j: true
i.value==j.value: true
========
i: rehi, j: not rehi
i==j: false
i equals j: false
i.value==j.value: false
========
i: rehi, j: new String("rehi")
i==j: false
i equals j: true
i.value==j.value: true
========
i: rehi, j: new String("not rehi")
i==j: false
i equals j: false
i.value==j.value: false
========
i: rehi, j: new String(rehi)
i==j: false
i equals j: true
i.value==j.value: true
========
i: rehi, j: new String(rehi (2))
i==j: false
i equals j: true
i.value==j.value: true
========
i: rehi, j: new String(not rehi)
i==j: false
i equals j: false
i.value==j.value: false
========
i: rehi (2), j: rehi
i==j: true
i equals j: true
i.value==j.value: true
========
i: rehi (2), j: rehi (2)
i==j: true
i equals j: true
i.value==j.value: true
========
i: rehi (2), j: not rehi
i==j: false
i equals j: false
i.value==j.value: false
========
i: rehi (2), j: new String("rehi")
i==j: false
i equals j: true
i.value==j.value: true
========
i: rehi (2), j: new String("not rehi")
i==j: false
i equals j: false
i.value==j.value: false
========
i: rehi (2), j: new String(rehi)
i==j: false
i equals j: true
i.value==j.value: true
========
i: rehi (2), j: new String(rehi (2))
i==j: false
i equals j: true
i.value==j.value: true
========
i: rehi (2), j: new String(not rehi)
i==j: false
i equals j: false
i.value==j.value: false
========
i: not rehi, j: rehi
i==j: false
i equals j: false
i.value==j.value: false
========
i: not rehi, j: rehi (2)
i==j: false
i equals j: false
i.value==j.value: false
========
i: not rehi, j: not rehi
i==j: true
i equals j: true
i.value==j.value: true
========
i: not rehi, j: new String("rehi")
i==j: false
i equals j: false
i.value==j.value: false
========
i: not rehi, j: new String("not rehi")
i==j: false
i equals j: true
i.value==j.value: true
========
i: not rehi, j: new String(rehi)
i==j: false
i equals j: false
i.value==j.value: false
========
i: not rehi, j: new String(rehi (2))
i==j: false
i equals j: false
i.value==j.value: false
========
i: not rehi, j: new String(not rehi)
i==j: false
i equals j: true
i.value==j.value: true
========
i: new String("rehi"), j: rehi
i==j: false
i equals j: true
i.value==j.value: true
========
i: new String("rehi"), j: rehi (2)
i==j: false
i equals j: true
i.value==j.value: true
========
i: new String("rehi"), j: not rehi
i==j: false
i equals j: false
i.value==j.value: false
========
i: new String("rehi"), j: new String("rehi")
i==j: true
i equals j: true
i.value==j.value: true
========
i: new String("rehi"), j: new String("not rehi")
i==j: false
i equals j: false
i.value==j.value: false
========
i: new String("rehi"), j: new String(rehi)
i==j: false
i equals j: true
i.value==j.value: true
========
i: new String("rehi"), j: new String(rehi (2))
i==j: false
i equals j: true
i.value==j.value: true
========
i: new String("rehi"), j: new String(not rehi)
i==j: false
i equals j: false
i.value==j.value: false
========
i: new String("not rehi"), j: rehi
i==j: false
i equals j: false
i.value==j.value: false
========
i: new String("not rehi"), j: rehi (2)
i==j: false
i equals j: false
i.value==j.value: false
========
i: new String("not rehi"), j: not rehi
i==j: false
i equals j: true
i.value==j.value: true
========
i: new String("not rehi"), j: new String("rehi")
i==j: false
i equals j: false
i.value==j.value: false
========
i: new String("not rehi"), j: new String("not rehi")
i==j: true
i equals j: true
i.value==j.value: true
========
i: new String("not rehi"), j: new String(rehi)
i==j: false
i equals j: false
i.value==j.value: false
========
i: new String("not rehi"), j: new String(rehi (2))
i==j: false
i equals j: false
i.value==j.value: false
========
i: new String("not rehi"), j: new String(not rehi)
i==j: false
i equals j: true
i.value==j.value: true
========
i: new String(rehi), j: rehi
i==j: false
i equals j: true
i.value==j.value: true
========
i: new String(rehi), j: rehi (2)
i==j: false
i equals j: true
i.value==j.value: true
========
i: new String(rehi), j: not rehi
i==j: false
i equals j: false
i.value==j.value: false
========
i: new String(rehi), j: new String("rehi")
i==j: false
i equals j: true
i.value==j.value: true
========
i: new String(rehi), j: new String("not rehi")
i==j: false
i equals j: false
i.value==j.value: false
========
i: new String(rehi), j: new String(rehi)
i==j: true
i equals j: true
i.value==j.value: true
========
i: new String(rehi), j: new String(rehi (2))
i==j: false
i equals j: true
i.value==j.value: true
========
i: new String(rehi), j: new String(not rehi)
i==j: false
i equals j: false
i.value==j.value: false
========
i: new String(rehi (2)), j: rehi
i==j: false
i equals j: true
i.value==j.value: true
========
i: new String(rehi (2)), j: rehi (2)
i==j: false
i equals j: true
i.value==j.value: true
========
i: new String(rehi (2)), j: not rehi
i==j: false
i equals j: false
i.value==j.value: false
========
i: new String(rehi (2)), j: new String("rehi")
i==j: false
i equals j: true
i.value==j.value: true
========
i: new String(rehi (2)), j: new String("not rehi")
i==j: false
i equals j: false
i.value==j.value: false
========
i: new String(rehi (2)), j: new String(rehi)
i==j: false
i equals j: true
i.value==j.value: true
========
i: new String(rehi (2)), j: new String(rehi (2))
i==j: true
i equals j: true
i.value==j.value: true
========
i: new String(rehi (2)), j: new String(not rehi)
i==j: false
i equals j: false
i.value==j.value: false
========
i: new String(not rehi), j: rehi
i==j: false
i equals j: false
i.value==j.value: false
========
i: new String(not rehi), j: rehi (2)
i==j: false
i equals j: false
i.value==j.value: false
========
i: new String(not rehi), j: not rehi
i==j: false
i equals j: true
i.value==j.value: true
========
i: new String(not rehi), j: new String("rehi")
i==j: false
i equals j: false
i.value==j.value: false
========
i: new String(not rehi), j: new String("not rehi")
i==j: false
i equals j: true
i.value==j.value: true
========
i: new String(not rehi), j: new String(rehi)
i==j: false
i equals j: false
i.value==j.value: false
========
i: new String(not rehi), j: new String(rehi (2))
i==j: false
i equals j: false
i.value==j.value: false
========
i: new String(not rehi), j: new String(not rehi)
i==j: true
i equals j: true
i.value==j.value: true
========
Chris Dennett
  • String s1 = new String("abc"), String s2 = new String("abc"). s1 != s2 because the two objects are different. But is there one copy of 'abc' in memory, or two? Where does the JVM allocate the 'abc' when it's created by the constructor? – liam xu May 09 '16 at 07:47
  • In most cases (when the size of the String and the underlying char array are equal), the new String object will have the same underlying char array as the passed String object. So there is one copy of 'abc' in memory (represented as a char array), but two strings using this. – Chris Dennett May 10 '16 at 02:22
  • 1
    This answer is simply wrong, so the upvotes should be removed. The construct `new String("word")` would only create a new string in the pool if there was no string literal in the pool with the same value. It will however create a new String object that references any existing literal in the pool, hence the result of checking for object reference equality. – Software Engineer May 27 '16 at 01:32
  • I clarified the answer. It was correct before, you misread it. – Chris Dennett May 28 '16 at 10:23
8

It's puzzling that no one directly answered the question, yet most answers have a lot of upvotes.

In a nutshell, the first creates an entry in the String Pool, which can be reused (more efficient, as covered by the links above on immutability and interning), and the second creates a new String object (more costly).

Both objects live in the heap. The references to both will be on the thread's stack (when s is a local variable).

http://www.journaldev.com/797/what-is-java-string-pool gives a clear insight into how this is achieved.

killjoy