I'm trying to see how much memory is used when I have a lot of duplicate strings. I am using the method described in this answer (at the bottom).

Here's me creating a list of ten million strings, where each string has only a few characters.

import java.util.ArrayList;
import java.util.List;

public class Test1 {

    public static void main(String[] args) {
        int count = 10000000;
        List<String> names = new ArrayList<String>();
        for (int i = 0; i < count; i++) {
            names.add("test"); // the same literal on every iteration
        }
        // Rough estimate of heap in use: allocated heap minus free heap.
        Runtime rt = Runtime.getRuntime();
        long usedMem = rt.totalMemory() - rt.freeMemory();
        System.out.println(usedMem / (1024*1024) + " MB");
    }
}

I run it, and it says 88 MB. I am not too sure what this represents, but I'll just take it as a number to compare with.

Here's me doing the same test again, except I replaced the small string with some lorem ipsum text:

import java.util.ArrayList;
import java.util.List;

public class Test1 {

    public static void main(String[] args) {
        int count = 10000000;
        List<String> names = new ArrayList<String>();
        for (int i = 0; i < count; i++) {
            names.add("Lorem ipsum dolor sit amet, brute euismod eleifend te quo, ne qui iudicabit hendrerit. Ea sit dolore assentior prodesset. In ludus adipiscing eos, ius erat graeco at, cu nec melius copiosae. Epicuri suavitate gubergren id sea, possim animal eu nam, cu error libris expetendis his. Te sea agam fabulas, vis eruditi complectitur ei. Ei sale modus vis, pri et iracundia temporibus. Mel mundi antiopam ad.");
        }
        Runtime rt = Runtime.getRuntime();
        long usedMem = rt.totalMemory() - rt.freeMemory();
        System.out.println(usedMem / (1024*1024) + " MB");
    }
}

I run this, and it says 88 MB again.

This is not meant to be an attempt to properly benchmark memory usage, but I was expecting the number for the lorem ipsum string to be somewhat larger because there are about 50x as many characters in the string.

How does Java store arrays of strings in memory? Or, am I doing something wrong?

MxLDevs
    This is not quite a duplicate of, but similar to, http://stackoverflow.com/questions/18669755/how-big-is-an-integer-in-java/18669841#18669841 – yshavit Feb 10 '14 at 22:15

3 Answers

Your List<String> isn't storing strings. It's storing string references.

In each case, you've got a single String object, and then a list with a lot of references to the same object. It's like having a single house, and millions of pieces of paper all with the same address on. That takes roughly the same amount of land, whether the house is a bungalow or a mansion.
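To make the house-and-address picture concrete, here is a minimal sketch that counts how many distinct objects a list actually refers to. The class name `SharedReferenceDemo` and the helper `distinctObjects` are made up for this illustration; the count is based on object identity rather than `equals`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class SharedReferenceDemo {

    // Counts distinct objects by identity (==), not by equals().
    static int distinctObjects(List<String> list) {
        Set<String> seen = Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        seen.addAll(list);
        return seen.size();
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            names.add("test"); // the same literal, so the same String object every time
        }
        // prints "1000 references, 1 distinct object(s)"
        System.out.println(names.size() + " references, "
                + distinctObjects(names) + " distinct object(s)");
    }
}
```

The list grows with every `add`, but only because it stores another reference; the single `String` it points to is never copied.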

If you want to see what happens when you create a different string for each entry in the list to refer to, try:

for (int i = 0; i < count; i++) {
    names.add("test" + i);
}

Now you'll run out of memory much more quickly, because each iteration creates a new String object, which takes a certain amount of memory. The exact amount depends on the implementation. In older JVMs, a String object holds a reference to a char[] object (an array of characters), a start position, a length, and a cached hash code; from Java 7 update 6 onward the start position and length fields are gone, and since Java 9 the backing array is a byte[] with a one-byte coder flag. Either way, for small strings the textual data is dwarfed by the housekeeping overhead, whereas for very large strings the data in the array takes the bulk of the space.
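A rough way to see the difference is to build both kinds of list in one run and compare the heap readings. This is only a sketch: `DistinctStringsDemo`, `build`, and `usedBytes` are names invented here, `System.gc()` is just a hint to the JVM, and the numbers will vary between JVMs and runs, so no expected output is given.

```java
import java.util.ArrayList;
import java.util.List;

class DistinctStringsDemo {

    // Builds a list of `count` entries: either one shared literal or a new String per entry.
    static List<String> build(int count, boolean distinct) {
        List<String> names = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            names.add(distinct ? "test" + i : "test");
        }
        return names;
    }

    // Same rough estimate as in the question: allocated heap minus free heap.
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        System.gc(); // only a hint; the JVM is free to ignore it
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        int count = 1_000_000; // smaller than in the question to stay within a default heap

        long before = usedBytes();
        List<String> shared = build(count, false);
        long afterShared = usedBytes();
        List<String> distinct = build(count, true);
        long afterDistinct = usedBytes();

        System.out.println("shared literal:   ~" + (afterShared - before) / (1024 * 1024) + " MB");
        System.out.println("distinct strings: ~" + (afterDistinct - afterShared) / (1024 * 1024) + " MB");

        // Use both lists afterwards so they stay reachable during the measurements.
        System.out.println(shared.size() + distinct.size() + " entries in total");
    }
}
```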

Jon Skeet
  • I understand now. The fact that the lorem ipsum string was only a few hundred characters longer didn't make a noticeable difference in this case, as there was still only one instance of it compared to ten million references. – MxLDevs Feb 10 '14 at 22:16
  • @MxyL: Exactly. Each of those references is only 4 or 8 bytes probably, but even so, that's a lot more than the space required by the single string. – Jon Skeet Feb 10 '14 at 22:16

You didn't create 10 million strings. You created 10 million references to the same, unique String instance. String literals are interned: every time your code uses

String s = "hello";

it uses a String "hello" placed in a pool. If 85 classes declare such a variable, they will all have a reference to this same String in the pool.

If you really want 10 million String instances, then use

list.add(new String("..."));

That will make a copy of the interned String on every call, and you'll have 10 million different instances.
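The difference between a literal and an explicit copy can be checked with `==`, which compares references rather than contents. A small sketch (the class name `NewStringDemo` and the helper `sameObject` are made up for the example):

```java
class NewStringDemo {

    // True only when both variables refer to the very same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal1 = "hello";
        String literal2 = "hello";
        String copy = new String("hello"); // explicit copy of the pooled literal

        System.out.println(sameObject(literal1, literal2)); // true: both come from the pool
        System.out.println(sameObject(literal1, copy));     // false: a separate object
        System.out.println(literal1.equals(copy));          // true: same characters
    }
}
```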

JB Nizet

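Java uses string pooling for string literals (and other compile-time constant strings). That means when the same literal appears multiple times, every occurrence actually points to the same instance. Strings built at run time are not pooled unless you call intern() on them.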
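As a hedged illustration of where pooling does and does not apply: strings concatenated at run time are separate objects, and `String.intern()` maps them back to the pooled instance. The class name `InternDemo` and the helper `runtimeString` are invented for this sketch.

```java
class InternDemo {

    // Concatenation with a non-constant value creates a new String at run time.
    static String runtimeString(int n) {
        return "test" + n;
    }

    public static void main(String[] args) {
        String a = runtimeString(1);
        String b = runtimeString(1);

        System.out.println(a == b);                   // false: two separate objects
        System.out.println(a.equals(b));              // true: same characters
        System.out.println(a.intern() == b.intern()); // true: both map to the pooled instance
        System.out.println(a.intern() == "test1");    // true: the literal is the pooled instance
    }
}
```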

Philipp