
I know how to calculate a Java object's memory size by adding three parts: header + properties + reference.

I also know that a Java array is an object too.

But when I read "Understanding the JVM: Advanced Features and Best Practices, Second Edition", it says a Java array's header consists of three parts: mark word, class pointer and array length.

It says the header is always 24 bytes on a 64-bit HotSpot JVM.

But in a 32-bit JVM, how can I calculate a Java array's memory size?

I hope you can give me some example Java code that shows how to calculate an object's memory size, not limited to array objects.

Holger
shengbang he

2 Answers


The actual object size is implementation specific and there is not even a requirement that the needed size for an object stays the same during its lifetime.

There’s an article on wiki.openjdk.java.net stating:

Object header layout

An object header consists of a native-sized mark word, a klass word, a 32-bit length word (if the object is an array), a 32-bit gap (if required by alignment rules), and then zero or more instance fields, array elements, or metadata fields. (Interesting Trivia: Klass metaobjects contain a C++ vtable immediately after the klass word.)

The gap field, if it exists, is often available to store instance fields.

If UseCompressedOops is false (and always on ILP32 systems), the mark and klass are both native machine words. For arrays, the gap is always present on LP64 systems, and only on arrays with 64-bit elements on ILP32 systems.

If UseCompressedOops is true, the klass is 32 bits. Non-arrays have a gap field immediately after the klass, while arrays store the length field immediately after the klass.

Your calculation “header + properties + reference” for an object’s size is not correct. First, references to an object are not part of the referent’s object size. There can be an arbitrary number of references to the same object, but these references don’t have to be in heap memory or in RAM at all, as optimized code may access an object purely via CPU registers.

Further, as hinted in the quote above, there are alignment rules which make the calculation of the memory required for the fields nontrivial. There might be a gap in the header which might be used for storing instance fields, if there are fields of a type fitting into it. While the fields of the same class may get arranged to minimize the padding, a subclass has to live with the superclass’ layout, potentially adding more fields to it and may only fill in the gaps if it has fields of fitting types, otherwise, there might be even more gaps due to the class hierarchy.

For arrays, you can derive from the quoted article that the 32 bit HotSpot representation uses a header of 12 bytes, unless the type is long[] or double[], in which case it will be 16 bytes.

For the 64 bit implementation, the UseCompressedOops option (which is on by default) allows combining a 64 bit mark word with a 32 bit klass and a 32 bit length into a header of 16 bytes total. Only if UseCompressedOops is off will the header be 24 bytes.
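The header sizes derived above can be collected in a small lookup. This is only a sketch of the arithmetic from the quoted wiki text, not a measurement of a real JVM; the class name, the `Mode` enum and the method name are hypothetical and chosen purely for illustration.

```java
public class ArrayHeaderSketch {

    /** Hypothetical names for the three configurations discussed above. */
    enum Mode { ILP32, LP64_COMPRESSED, LP64_UNCOMPRESSED }

    /**
     * Array header size in bytes, following the quoted wiki text.
     * elementBytes is the size of one array element (8 for long and double).
     */
    static int arrayHeaderBytes(Mode mode, int elementBytes) {
        switch (mode) {
            case ILP32:
                // 4-byte mark + 4-byte klass + 4-byte length,
                // plus a 4-byte gap only when the elements are 64 bits wide
                return elementBytes == 8 ? 16 : 12;
            case LP64_COMPRESSED:
                // 8-byte mark + 4-byte klass + 4-byte length
                return 16;
            case LP64_UNCOMPRESSED:
                // 8-byte mark + 8-byte klass + 4-byte length + 4-byte gap
                return 24;
            default:
                throw new IllegalArgumentException("unknown mode: " + mode);
        }
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            System.out.println(mode
                    + ": int[] header = " + arrayHeaderBytes(mode, 4) + " bytes"
                    + ", long[] header = " + arrayHeaderBytes(mode, 8) + " bytes");
        }
    }
}
```

The element data follows the header, so the unaligned size of an array is the header size plus the element size times the length.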

Holger
  • you are absolutely correct about inheritance... https://bugs.openjdk.java.net/browse/JDK-8024913; you are almost right about `16 bytes` on long/double - it's actually still `12 bytes`, but long and double are "special" and can't take that 4 bytes left from headers..., so the extra 4 bytes is padding actually. – Eugene May 24 '18 at 14:59
  • @Eugene I referred to the citation’s wording which considers the “gap field” as part of the header in the first paragraph. Granted, that’s a bit strange considering the follow-up sentence “The gap field, if it exists, is often available to store instance fields” as taking this literally, it implies that there can be instance fields in the object header. I think the best approach would be to forget about the term “object header” completely and only talk about “klass”, “mark word” and “gap”. – Holger May 24 '18 at 15:36

You can test this using the JOL framework, written by the almighty Aleksey Shipilev.

Using it is actually pretty easy. First, let's define the layouts you care about:

import org.openjdk.jol.datamodel.*;
import org.openjdk.jol.layouters.*;

Layouter layout32Bits = new HotSpotLayouter(new X86_32_DataModel());
Layouter layout64Bits = new HotSpotLayouter(new X86_64_DataModel());
Layouter layout64BitsComp = new HotSpotLayouter(new X86_64_COOPS_DataModel());

And then, let's define an array and see the results:

// ClassLayout is org.openjdk.jol.info.ClassLayout
int[] ints = new int[10]; // ten elements, matching the 40 bytes of elements in the output below
System.out.println(ClassLayout.parseInstance(ints, layout32Bits).toPrintable());
System.out.println(ClassLayout.parseInstance(ints, layout64Bits).toPrintable());
System.out.println(ClassLayout.parseInstance(ints, layout64BitsComp).toPrintable());

Let's look at them one at a time. For the 32-bit VM:

[I object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           09 00 00 00 (00001001 00000000 00000000 00000000) (9)
      4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4        (object header)                           10 0b 40 29 (00010000 00001011 01000000 00101001) (692062992)
     12    40    int [I.<elements>                             N/A
     52    12        (loss due to the next object alignment)
Instance size: 64 bytes
Space losses: 0 bytes internal + 12 bytes external = 12 bytes total

So you get 12 bytes for the header (4 + 4 for the two header words, plus 4 for the length of the array, which is an int), and then 40 bytes for the 10 ints that the array holds.

Next is something I am not entirely sure I understand. So far we have 52 bytes, and objects are aligned on 8 bytes, so 52 should be rounded up to 56 bytes.

Instead it reports a 12-byte loss due to the next object alignment. I can only guess at two possible explanations: first, read the comments here; or maybe some field is there for internal purposes only.
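The rounding expected above can be checked with a few lines of Java. This is a minimal sketch of the arithmetic only; the class and method names are hypothetical, and the 16-byte case is included purely for comparison, not as a claim about how JOL arrived at 64.

```java
public class AlignUpSketch {

    /** Rounds size up to the next multiple of alignment (alignment must be a power of two). */
    static long alignUp(long size, long alignment) {
        return (size + alignment - 1) & -alignment;
    }

    public static void main(String[] args) {
        long raw = 12 + 10 * 4; // 12-byte header + ten 4-byte ints = 52
        System.out.println("raw size:      " + raw);
        System.out.println("aligned to 8:  " + alignUp(raw, 8));
        System.out.println("aligned to 16: " + alignUp(raw, 16));
    }
}
```

With 8-byte alignment the 52 bytes round up to 56, which would be a 4-byte loss; a 16-byte boundary would give 64.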

I am not going to show the output of the other examples here (you can run them yourself), and I will shortly ask a follow-up question about the padding that is not clear to me.

Eugene