String storage in Java depends on how the string was obtained. The backing char array can be shared between multiple instances. If that isn't the case, you have the usual object overhead plus storage for one pointer and three ints, which usually comes out to 16 bytes of overhead. Then the backing array requires 2 bytes per char, since chars are UTF-16 code units.
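To make the 2-bytes-per-char point concrete, here is a minimal sketch. The class and method names are made up for illustration, and it only counts the char data for the char[]-backed layout described here, not any object overhead.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class Utf16Size {
    // Bytes needed for the char data alone: one 2-byte UTF-16 code unit per char.
    static int charDataBytes(String s) {
        return s.length() * 2;
    }

    public static void main(String[] args) {
        String ascii = "Java";
        // U+1F600 lies outside the BMP, so it is stored as a surrogate pair (two chars).
        String emoji = new String(Character.toChars(0x1F600));

        System.out.println(charDataBytes(ascii));
        System.out.println(emoji.length());                           // counts chars, not code points
        System.out.println(emoji.codePointCount(0, emoji.length()));
        // Cross-check against an explicit UTF-16 encoding without a byte-order mark.
        System.out.println(ascii.getBytes(StandardCharsets.UTF_16BE).length);
    }
}
```

Note that length() counts code units, so a character outside the BMP costs 4 bytes of char data rather than 2.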
For "Apple Computers", where the backing array is not shared, the minimum cost is going to be:
- backing array: the 15 chars need 30B, which rounds up to 32B (16 chars' worth) to sit on a word boundary
- pointer to the array: 4 or 8B depending on the platform
- three ints for the offset, length, and memoized hashcode: 12B
- 2 x object overhead (one for the String, one for the array): depends on the VM, but 8B each is a good rule of thumb
- one int for the array length: 4B
So roughly 72B (taking the 8B pointer), of which the actual payload (the 32B of char data) constitutes 44.4%. The payload's share grows for longer strings because the overhead is fixed.
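The arithmetic above can be written down as a small sketch. This just adds up the rule-of-thumb figures from the list; the class name, method names, and the 8B object overhead constant are assumptions for illustration, and a real VM may pad or lay things out differently.

```java
// Hypothetical estimator, for illustration only: sums the rule-of-thumb
// figures from the list above rather than measuring a real VM.
final class StringFootprint {
    static final int OBJECT_OVERHEAD = 8; // assumed per-object overhead (rule of thumb)
    static final int INT_SIZE = 4;

    // charCount: chars in the (unshared) backing array; pointerSize: 4 or 8.
    static int estimateBytes(int charCount, int pointerSize) {
        int charData = charCount * 2;          // UTF-16 code units
        int stringInts = 3 * INT_SIZE;         // offset, length, memoized hashcode
        int overhead = 2 * OBJECT_OVERHEAD;    // String object + array object
        int arrayLength = INT_SIZE;            // length field of the array
        return charData + pointerSize + stringInts + overhead + arrayLength;
    }

    // Fraction of the estimate that is actual char data.
    static double payloadShare(int charCount, int pointerSize) {
        return (charCount * 2) / (double) estimateBytes(charCount, pointerSize);
    }

    public static void main(String[] args) {
        System.out.println(estimateBytes(16, 8)); // 72
        System.out.println(estimateBytes(16, 4)); // 68
        System.out.println(payloadShare(16, 8));
    }
}
```

With the 16-char figure and an 8B pointer this gives the 72B above; with a 4B pointer it comes out to 68B.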
In Java 7, some JDK implementations are doing away with backing array sharing to avoid pinning large char[]s in memory. That allows them to drop two of the three ints (the offset and length), leaving only the memoized hashcode.
That changes the calculation to 64B for a string of length 16, of which the actual payload constitutes 50%.
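As a rough sketch of how the two layouts compare, the same sum can be parameterized by the number of int fields in the String (three with sharing, one without). The names and the 8B object overhead are assumptions for illustration, not anything taken from a JDK.

```java
// Hypothetical comparison of the two layouts, for illustration only.
final class LayoutComparison {
    static final int OBJECT_OVERHEAD = 8; // assumed per-object overhead (rule of thumb)
    static final int INT_SIZE = 4;

    // intFields: 3 with array sharing (offset, length, hashcode), 1 without (hashcode only).
    static int estimateBytes(int charCount, int pointerSize, int intFields) {
        return charCount * 2              // char data
             + pointerSize                // reference to the array
             + intFields * INT_SIZE       // int fields in the String
             + 2 * OBJECT_OVERHEAD        // String object + array object
             + INT_SIZE;                  // array length
    }

    static double payloadShare(int charCount, int pointerSize, int intFields) {
        return (charCount * 2) / (double) estimateBytes(charCount, pointerSize, intFields);
    }

    public static void main(String[] args) {
        // The fixed overhead matters less and less as the string grows.
        for (int chars : new int[] {16, 64, 256, 1024}) {
            System.out.printf("%d chars: shared layout %d B, unshared layout %d B, payload %.1f%%%n",
                    chars,
                    estimateBytes(chars, 8, 3),
                    estimateBytes(chars, 8, 1),
                    100 * payloadShare(chars, 8, 1));
        }
    }
}
```

For 16 chars and an 8B pointer this yields 72B for the older layout and 64B without sharing, and the payload share climbs toward 100% as the length increases.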