I have an object in Java which contains a String. I am curious how the memory usage of a String works. I'm trying to optimize memory usage for my program and the application will have about 10000 such objects. For a String such as "Hello World" what would the memory usage be?
-
http://stackoverflow.com/questions/9699071/what-is-the-javas-internal-represention-for-string-modified-utf-8-utf-16 – jdevelop Nov 04 '13 at 20:26
-
The memory usage for "Hello World" is probably 11 or 22 bytes, depending on the encoding. However, if you have the same string in several places they may be the same object and you only spend memory for the reference. – Cruncher Nov 04 '13 at 20:27
-
@Cruncher: there is no "depending on the encoding" in Java. Internally all Java String objects are stored as UTF-16 (or some version of that). – Nov 04 '13 at 20:29
-
@a_horse_with_no_name Doesn't `UseCompressedStrings` change that calculus a bit? – Vidya Nov 04 '13 at 20:33
-
@Vidya: hmm, good point. I don't really know to be honest. – Nov 04 '13 at 20:39
1 Answer
Java uses two bytes per character *, so you would need to multiply the number of characters by two to get a rough approximation. In addition to the storage of the "payload", you would need to account for the space allocated to the reference to your string, which is usually the size of a pointer on your target architecture, the space for the length of the string, which is an `int`, and the space for the cached hash code, which is another `int`.
Since "Hello World" is 11 characters long, I would estimate its size as 2*11+4+4+4=34 bytes on computers with 32-bit pointers, or 2*11+8+4+4=38 bytes on computers with 64-bit pointers.
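The arithmetic above can be written as a small helper. This is only a sketch of the rough formula from this answer; the class and method names (`StringSizeEstimate`, `estimateBytes`) are made up for illustration, and the estimate assumes that object headers, array headers, and alignment padding are ignored, so a real JVM will typically use more than this.

```java
class StringSizeEstimate {
    // Rough estimate following the formula in the answer:
    // 2 bytes per character, plus one reference, plus two int fields
    // (the length and the cached hash code).
    // Assumption: object/array headers and alignment padding are NOT counted.
    static int estimateBytes(String s, int pointerBytes) {
        int payload = 2 * s.length(); // two bytes per UTF-16 code unit
        int lengthField = 4;          // int
        int hashField = 4;            // int
        return payload + pointerBytes + lengthField + hashField;
    }

    public static void main(String[] args) {
        String s = "Hello World";
        System.out.println(estimateBytes(s, 4)); // 32-bit pointers: 34
        System.out.println(estimateBytes(s, 8)); // 64-bit pointers: 38
    }
}
```

For an actual measurement rather than an estimate, a heap profiler or an instrumentation agent would be more reliable than this formula.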
Note: this estimate does not consider the effects of interning string constants. When a string is interned, all references to the interned string share the same payload, so the extra memory per additional instance of an interned string is the size of a reference (i.e. the pointer size on the target architecture).
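The sharing described in the note can be observed with reference comparison (`==`). The following is a minimal sketch; the class and method names (`InternDemo`, `literalsShared`, and so on) are hypothetical and exist only for this illustration.

```java
class InternDemo {
    // Two literals with the same contents are interned automatically,
    // so both variables refer to the same String object.
    static boolean literalsShared() {
        String a = "Hello World";
        String b = "Hello World";
        return a == b;
    }

    // new String(...) always allocates a distinct object,
    // even when the contents match an interned literal.
    static boolean newStringShared() {
        String a = "Hello World";
        String c = new String("Hello World");
        return a == c;
    }

    // intern() returns the canonical pooled instance for the contents.
    static boolean internedShared() {
        String a = "Hello World";
        String c = new String("Hello World").intern();
        return a == c;
    }

    public static void main(String[] args) {
        System.out.println(literalsShared());  // true
        System.out.println(newStringShared()); // false
        System.out.println(internedShared());  // true
    }
}
```

With roughly 10000 objects, the distinction matters only if many of them hold strings with equal contents; otherwise each string carries its own payload.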
* Unless the `-XX:+UseCompressedStrings` option is used, in which case strings that can be represented as pure ASCII are stored in a `byte[]`, using one byte per character.
-
Don't forget about interned strings (String.intern) and the pool for them. I can declare `String s = "Hi.";` in every class, but it will be shared across all instances, so it's not X instances times s.length(). It's roughly 4 + instance pointer size * instances. – MadConan Nov 04 '13 at 20:35