
According to Javadoc about String.intern():

When the intern method is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned.
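
A minimal sketch of that contract, for illustration only. The class name `InternContract` and the helper `build` are made up here; the strings are assembled at run time so that neither starts out as the pooled instance.

```java
final class InternContract {
    // Hypothetical helper: concatenates at run time, so the result is a fresh,
    // non-pooled String object on every call.
    static String build(String head, String tail) {
        return new StringBuilder(head).append(tail).toString();
    }

    public static void main(String[] args) {
        String a = build("Te", "st");
        String b = build("Te", "st");

        System.out.println(a == b);                   // false: two distinct objects
        System.out.println(a.equals(b));              // true: same characters
        System.out.println(a.intern() == b.intern()); // true: both map to the one pooled instance
    }
}
```

Equal strings always yield the same reference from `intern()`, which is what makes a later `==` comparison between interned strings meaningful.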

I have a few questions about this.

  1. When a new String object is created with the `new` operator rather than from a string literal, for example: String str = new String("Test");

Question: I am aware that a new object will be created on the heap. But will the string Test also be put into the string pool during object creation? If yes, why is the reference from the string pool not returned directly? If no, why not put the string in the pool directly, now that the string pool has been moved out of PermGen into the regular heap (i.e. there is no space constraint apart from the heap size limit)? Some posts state that the string is inserted into the pool as soon as the object is created, whereas other posts contradict this.

  2. Once we call String.intern() on a String object (literals are already interned), what happens to the space allocated to the original object? Is it reclaimed at that moment, or does it wait for the next GC cycle?

  3. The accepted answer to another question on SO states that String.intern() should be used when you need speed, since you can then compare strings by reference (== is faster than equals).

Question: I am aware that String.intern() returns a reference to the string already present in the string pool. But this requires a full-scan lookup in the string pool, which can be an expensive operation in itself. So is the speed gained during string comparison justifiable? If so, why?

I have looked at below sources:

  • Why would you expect the string lookup to be expensive? I'd imagine it's an `O(1)` operation, as there's certainly a `HashSet`-style structure behind it. – Kayaman Dec 26 '15 at 19:06
  • @Kayaman Okay agreed that the lookup will not be expensive (I should have thought about that earlier I guess). This also explains in a way how it will impact during comparison. Related to answers on this [question](http://stackoverflow.com/questions/14552285/what-is-the-time-complexity-of-equals-in-java-for-2-strings) – Aaditya Gavandalkar Dec 26 '15 at 19:13
  • @VinceEmigh Post that talks about performance of intern: http://stackoverflow.com/questions/10624232/performance-penalty-of-string-intern – Aaditya Gavandalkar Dec 26 '15 at 19:16
  • If you use `new`, Java will _always_ create a new object. – Louis Wasserman Dec 26 '15 at 19:19
  • @LouisWasserman I understand that using `new` creates a new object on the heap every time. But I want to know whether the string is inserted into the string pool at the time of object creation. Referring to question 1 again: if yes, why is the reference from the string pool not returned directly? If no, why not put the string in the pool directly, now that the string pool has been moved out of PermGen into the regular heap (i.e. there is no space constraint apart from the heap size limit)? – Aaditya Gavandalkar Dec 26 '15 at 19:22
  • You're passing in `"Test"` to the constructor, and referencing the string literal `"Test"` puts it in the string pool. And then you call `new`, which explicitly asks Java to make a copy and return you that copy instead of the reference in the string pool. – Louis Wasserman Dec 26 '15 at 19:24
  • @LouisWasserman Okay, that definitely makes sense. But now I am more curious about why a new object is returned at all. How is it useful? Why not return a reference to the same object in the pool, which would speed up comparison and also save heap space, since strings are immutable? (Unless somewhere a different reference with the same value is needed, but I am not sure where that might be required.) – Aaditya Gavandalkar Dec 26 '15 at 19:29
  • It's not useful. That's why essentially no real code uses that `String` constructor. – Louis Wasserman Dec 26 '15 at 19:30
  • @LouisWasserman Thanks for the explanation :) – Aaditya Gavandalkar Dec 26 '15 at 19:31

1 Answer

  1. All string literals are interned: the compiler stores them in the class file's constant pool, and the JVM interns them at run time when it resolves them. Passing a string literal to the single-argument constructor that takes a String is a bit of an abuse of that constructor, so you will most likely end up with two String objects (there may be a special compiler case for this, but I can't say for sure). As of Java 8, the OpenJDK implementation of the constructor is:
    public String(String original) {
        this.value = original.value;
        this.hash = original.hash;
    }

So there is no special treatment on this side. If you know the literal, don't use this constructor.

  2. I don't think there are any special GC semantics for Strings. Like any other object, it will be collected once it is unreachable and the GC deems it worth collecting.

  3. Don't ever use == to compare strings. The first step in String's own equals method is exactly that reference check, so if your dominant case is comparing interned strings, all you pay is the overhead of a method call, which is tiny. The potential for future bugs that == introduces is too big a risk for such a minuscule gain.
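
A short sketch of the points about the constructor and about == versus equals. The class name `StringCopyDemo` and the helper `sameObject` are hypothetical, introduced only for this example.

```java
final class StringCopyDemo {
    // Hypothetical helper: true only when both references point at the same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "Test";          // interned literal
        String copy = new String("Test"); // explicit copy, a distinct object

        System.out.println(sameObject(literal, copy));          // false
        System.out.println(literal.equals(copy));               // true
        System.out.println(sameObject(literal, copy.intern())); // true
        System.out.println(sameObject(literal, "Te" + "st"));   // true: compile-time constant
    }
}
```

The == comparison is only correct when every string involved is known to be interned, while equals gives the right answer in all four cases.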

MahdeTo