I've read questions on SO explaining that Java automatically interns String literals, and obviously interns when intern() is called. However, I am wondering if in a loop (a foreach loop in my case) the Strings are also automatically interned. I ask because I am interning Strings to save memory in several very large LinkedHashMaps and want to know if calling intern() is redundant.

Example:

String array[] = createLargeArbitraryArray();
Map<String, Float> map = new LinkedHashMap<String, Float>(array.length);
for (String s : array) {
  map.put(s, 1f);
}

Would the next example differ in memory usage from the first (assuming the same array values), or is s already interned at that point?

String array[] = createLargeArbitraryArray();
Map<String, Float> map = new LinkedHashMap<String, Float>(array.length);
for (String s : array) {
  map.put(s.intern(), 1f);
}

I realize that in this case, there may be no equivalent Strings, but in my case I have a custom key for several maps that does use duplicate String values in the end, and I am wondering if calling intern() to save memory would be redundant or not at this point.

To add to this question, if an array was created from literals (in which case they are interned initially), then passed to a method as an argument, would a loop as above use interned Strings or not? i.e.:

/**
 * @param array created from literals
 */
public void populateMap(String[] array) {
  for (String s : array)
    map.put(s, 1f); // Interned or not?
}

In this case, either s is interned because array was interned to begin with, or s is not interned because it is declared as a new object in the loop parameters. Which is correct?

EDIT: To further explain my reasons for interning, I want to free up memory in the heap to avoid hitting the max heap size or GC overhead limit. Performance and speed are not extremely important to me in this case.

17slim
  • Hmm... I'm skeptical about `intern`ing strings yourself. Have you measured an actual performance benefit? Oftentimes, it can actually hurt performance. – PC Luddite Aug 09 '16 at 15:13
  • I have not yet. I am willing to sacrifice some performance/speed to avoid hitting heap or GC overhead limits, however. My main reason for interning is memory, not performance. – 17slim Aug 09 '16 at 15:14
  • Then I suppose in this case if you wanted your strings to be interned, then you should call it explicitly. They won't automatically be interned. – PC Luddite Aug 09 '16 at 15:16
  • This has very little to do with loops and most to do with references and argument evaluation. Please read [What is the difference between a variable, object, and reference?](http://stackoverflow.com/questions/32010172/what-is-the-difference-between-a-variable-object-and-reference) and [Is Java “pass-by-reference” or “pass-by-value”?](http://stackoverflow.com/questions/40480/is-java-pass-by-reference-or-pass-by-value) – Sotirios Delimanolis Aug 09 '16 at 15:27
  • @SotiriosDelimanolis While I do see where part of the question is related to those concepts, I am also asking whether Java will automatically intern in those locations; I'm asking if it will *do* something rather than if it will *pass* something. – 17slim Aug 09 '16 at 15:39

1 Answer

Explicitly calling intern() is the right approach here (or, as PC Luddite says, check whether you really need interning). I don't know of any JVM or JIT compiler that interns strings automatically in a loop, and even if some optimization did so, you shouldn't rely on it.

If an array is created from interned strings and then passed to a method, the loop variable refers to those same interned String objects. Iterating does not create any additional strings.
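
As a minimal sketch of the two points above (the class and method names are hypothetical and chosen only for illustration), the following checks that a for-each loop hands back the very objects stored in the array, and that only an explicit intern() call maps a runtime-created string to the pooled instance:

```java
class InternLoopDemo {
    /** True if the loop variable refers to the very object stored in the array. */
    static boolean loopSeesSameObjects(String[] array) {
        int i = 0;
        for (String s : array) {
            if (s != array[i++]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String literal = "key";            // string literals are interned
        String copy = new String("key");   // distinct object with equal contents
        String[] array = {literal, copy};

        // The loop neither copies nor interns; it only reads references.
        System.out.println(loopSeesSameObjects(array));      // true
        // The second element is still a separate, non-pooled object.
        System.out.println(array[0] == array[1]);            // false
        // intern() returns the pooled instance, here the literal.
        System.out.println(array[0] == array[1].intern());   // true
    }
}
```

The identity comparisons with == are used here only to observe which object a reference points to; ordinary code should still compare string contents with equals().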

TheLostMind
  • Thanks for the answer! I have one last question: let's say I set a second array to an interned array (`array2=internedArray`), would `array2` then contain the same interned String objects? – 17slim Aug 09 '16 at 15:23
  • @17slim - Yes. Both arrays will contain references to the same String literals (in the String constants pool). Remember Java is pass by value (even references are passed by value), so if you pass the array reference to other methods, the same String literals will be reused – TheLostMind Aug 09 '16 at 15:26
  • JVMs can’t do arbitrary automatic interning, as it would change the semantics of the program if two explicitly created string instances suddenly happened to be the same instance. Or well, if the JVM could prove that the entire application will never notice this change of object identities, it would be possible, but I claim that the cost of performing such a proof would outweigh any benefits. What recent JVMs can do is change the internal array references, so that distinct string instances with equal contents are forced to use the same array instance. – Holger Aug 23 '16 at 17:13
  • @17slim: Note that in the example code snippets of your question, you are storing the result of the `intern()` operation in the map, but leave the array unchanged. So the array does not necessarily contain references to the interned instances, and the interning can even cause *more* memory usage, depending on whether `intern()` returns a new instance or not when used that way. – Holger Aug 23 '16 at 17:20
  • @Holger - Yes, but if I remember correctly, the JVM spec doesn't say that "it shouldn't be done", so some custom JVMs could at least theoretically do some optimisation, but as I mentioned in my answer, I am not aware of any such optimisation in well-known JVMs. Yes, Java 1.8+ supports String Deduplication :) – TheLostMind Aug 23 '16 at 17:24
  • @TheLostMind: The specification doesn’t have to mention every forbidden thing explicitly. The way the Java language is defined forbids `new Foo() == new Foo()` from returning `true`, and it also forbids `a == b` from changing its outcome for two ordinary object references; both rules apply to the `String` class as well. We can call the fact that `String` has public constructors a historical design mistake, but that’s the way it is today. – Holger Aug 23 '16 at 17:38
  • Of course, all JRE methods returning a `String` with unspecified identity could enable automatic interning, but to do it at a later time, the JVM would have to track whether the object identity was ever perceived in any way. So, well, it *is* theoretically possible within these limitations, and we agree that we don't know of a JVM doing such things today. – Holger Aug 23 '16 at 17:38
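
Holger's remark that the question's snippets intern only the map keys while leaving the array unchanged can be illustrated with a short sketch. The helper name internInPlace is hypothetical, and the sketch assumes the array contains no null elements. It writes each interned reference back into the array so that the array stops holding the duplicate instances:

```java
class InternInPlaceDemo {
    /** Replaces every element with its interned instance (assumes no null elements). */
    static void internInPlace(String[] array) {
        for (int i = 0; i < array.length; i++) {
            array[i] = array[i].intern();
        }
    }

    public static void main(String[] args) {
        // Two distinct objects with equal contents, as runtime-built keys often are.
        String[] array = {new String("dup"), new String("dup")};
        System.out.println(array[0] == array[1]);   // false

        internInPlace(array);
        // Both slots now refer to the single pooled instance.
        System.out.println(array[0] == array[1]);   // true
    }
}
```

Whether this actually lowers heap usage depends on how long the original array and its strings stay reachable, so it is worth measuring before relying on it.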