
I have read some answers to this question (Why can't I create an array with large size? and https://bugs.openjdk.java.net/browse/JDK-8029587) and I don't understand the following: "In the GC code we pass around the size of objects in words as an int." As far as I know, the size of a word in the JVM is 4 bytes. Accordingly, if we pass around the size of a long array of large size (for example, MAX_INT - 5) in words as an int, we should get an OutOfMemoryError with "Requested array size exceeds VM limit", because that size is too large for an int even without the header size. So why do arrays of different types have the same limit on the maximum number of elements?
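To make the symmetry concrete, here is a small demo sketch. It is HotSpot-specific: both array lengths below exceed HotSpot's per-type limit (roughly Integer.MAX_VALUE - 2), so both allocations are rejected up front with the same error, regardless of how much heap is available; the exact message is an implementation detail of HotSpot.

```java
// Hedged demo (HotSpot-specific): a byte[] of Integer.MAX_VALUE elements would
// need ~2 GB and a long[] ~16 GB, yet both fail immediately with the same
// "Requested array size exceeds VM limit" error, before any memory is reserved.
public class ArrayLimitDemo {
    static String tryAllocate(Runnable allocation) {
        try {
            allocation.run();
            return "allocated";
        } catch (OutOfMemoryError e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryAllocate(() -> { byte[] a = new byte[Integer.MAX_VALUE]; }));
        System.out.println(tryAllocate(() -> { long[] a = new long[Integer.MAX_VALUE]; }));
    }
}
```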

WildWind03
  • They don't. The limit is that both the allocated size in words and the number of elements must fit in a signed 32-bit integer. For arrays of reference types, that size is the same as the number of elements (barring possible header words). For byte/short the size cap won't be hit. On 32-bit platforms I'd expect a long[] to be limited to roughly maxint/2 items. – Zastai Apr 24 '17 at 15:40
  • There is an answer to this question in your question's first link: https://bugs.openjdk.java.net/browse/JDK-8029587 – vhula Apr 24 '17 at 15:43
  • @Zastai If that were true there wouldn't be any questions, but in fact I can't create an array of bytes with MAX_INT size, for example – WildWind03 Apr 24 '17 at 16:50

2 Answers


Only addressing the "why do arrays of different types have the same limit on the max count of elements?" part:

Because it doesn't matter too much in practice, and it allows the code implementing the JVM to be simpler.

When there is only one limit that is the same for all kinds of arrays, the same code can handle all arrays, instead of needing a lot of type-specific code.

And given that the people who need "large" arrays can still create them, and only those who need really, really large arrays are affected, why spend that effort?

GhostCat
  • It would be clearer if that one limit were based on the long type. I can't understand how I can create a long array of (MAX_INT - 1) elements if it's impossible to index the words of that array using an int variable – WildWind03 Apr 24 '17 at 16:46
  • I have to admit I was in a rush when writing my answer. I might update it tomorrow if you don't receive better input by then. – GhostCat Apr 24 '17 at 17:02
  • After seeing the code as cited in the other answer, I don't have the feeling that this made the code simpler; indeed, the behavior doesn't even seem to be intentional. And "really really large" depends on the point of view. `new byte[Integer.MAX_VALUE]` isn't that impressive nowadays and it feels quite strange not being able to allocate that, while allocating `new int[Integer.MAX_VALUE-8]`, almost four times that memory, works without problems. – Holger Jul 04 '18 at 14:19

The answer is in the JDK sources as far as I can tell (I'm looking at jdk-9). After writing this I'm not sure whether it should be a comment instead (or whether it answers your question), but it's too long for a comment...

First, the error is thrown from hotspot/src/share/vm/oops/arrayKlass.cpp, here:

if (length > arrayOopDesc::max_array_length(T_ARRAY)) {
    report_java_out_of_memory("Requested array size exceeds VM limit");
    ....
}

Now, T_ARRAY is actually an enum of type BasicType that looks like this:

public static final BasicType T_ARRAY = new BasicType(tArray);
// tArray is an int with value = 13

That is the first indication that, when computing the maximum size, the JDK does not care what the array will hold (T_ARRAY does not specify what types the array will hold).
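The element size does enter the computation later, via type2aelembytes, which simply maps a BasicType tag to its element size in bytes. As a sketch, the table looks roughly like this in Java (the values mirror HotSpot's globalDefinitions; the reference size for T_OBJECT/T_ARRAY is assumed to be 8 here, i.e. a 64-bit JVM without compressed oops):

```java
import java.util.Map;

public class ElementSizes {
    // Sketch of HotSpot's type2aelembytes table; the reference size (8) is an
    // assumption for a 64-bit JVM without compressed oops.
    static final Map<String, Integer> TYPE2AELEMBYTES = Map.of(
            "T_BOOLEAN", 1, "T_BYTE", 1,
            "T_CHAR", 2,    "T_SHORT", 2,
            "T_INT", 4,     "T_FLOAT", 4,
            "T_LONG", 8,    "T_DOUBLE", 8,
            "T_OBJECT", 8,  "T_ARRAY", 8);

    public static void main(String[] args) {
        System.out.println(TYPE2AELEMBYTES.get("T_LONG")); // 8
    }
}
```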

Now the method that actually validates the maximum array size looks like this:

static int32_t max_array_length(BasicType type) {
  assert(type >= 0 && type < T_CONFLICT, "wrong type");
  assert(type2aelembytes(type) != 0, "wrong type");

  const size_t max_element_words_per_size_t =
      align_size_down((SIZE_MAX/HeapWordSize - header_size(type)), MinObjAlignment);
  const size_t max_elements_per_size_t =
      HeapWordSize * max_element_words_per_size_t / type2aelembytes(type);
  if ((size_t)max_jint < max_elements_per_size_t) {
    // It should be ok to return max_jint here, but parts of the code
    // (CollectedHeap, Klass::oop_oop_iterate(), and more) uses an int for
    // passing around the size (in words) of an object. So, we need to avoid
    // overflowing an int when we add the header. See CRs 4718400 and 7110613.
    return align_size_down(max_jint - header_size(type), MinObjAlignment);
  }
  return (int32_t)max_elements_per_size_t;
}
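To see why this produces (nearly) the same cap for every element type, here is a hedged Java port of the computation above. The word size, alignment, and the 2-word array header are assumptions for a typical 64-bit HotSpot. On such a JVM, max_elements_per_size_t always exceeds max_jint, so the function always takes the branch that returns max_jint minus the header size in words, independent of the element size:

```java
public class MaxArrayLength {
    static final int HEAP_WORD_SIZE = 8;      // assumed: 64-bit JVM
    static final int MIN_OBJ_ALIGNMENT = 1;   // in words: 8-byte alignment / 8-byte words

    // align_size_down for a power-of-two alignment
    static long alignDown(long size, long alignment) {
        return size & ~(alignment - 1);
    }

    // Sketch of arrayOopDesc::max_array_length; elemBytes and headerWords are
    // parameters here so different element types can be compared.
    static int maxArrayLength(int elemBytes, int headerWords) {
        long sizeMax = -1L;                   // SIZE_MAX as an unsigned 64-bit value
        long maxElementWords = alignDown(
                Long.divideUnsigned(sizeMax, HEAP_WORD_SIZE) - headerWords,
                MIN_OBJ_ALIGNMENT);
        // ~2^64 bytes divided by the element size: always far above max_jint
        long maxElements = Long.divideUnsigned(
                HEAP_WORD_SIZE * maxElementWords, elemBytes);
        if (Long.compareUnsigned(Integer.MAX_VALUE, maxElements) < 0) {
            // Always reached on 64-bit: cap the length so the object size in
            // words (including the header) still fits in an int.
            return (int) alignDown(Integer.MAX_VALUE - headerWords, MIN_OBJ_ALIGNMENT);
        }
        return (int) maxElements;
    }

    public static void main(String[] args) {
        // With an assumed 2-word (16-byte) array header, both come out as
        // Integer.MAX_VALUE - 2, matching the limit observed in practice.
        System.out.println("byte[] limit: " + maxArrayLength(1, 2)); // 2147483645
        System.out.println("long[] limit: " + maxArrayLength(8, 2)); // 2147483645
    }
}
```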

I did not dive too deep into the code, but it is based on HeapWordSize, which is at least 8 bytes. Here is a good reference (I tried to look it up in the code itself, but there are too many references to it).

Eugene
  • Why is HeapWordSize at least 8 bytes? I found here http://hg.openjdk.java.net/jdk6/jdk6/hotspot/file/tip/src/share/vm/utilities/globalDefinitions.hpp that HeapWordSize = sizeof(HeapWord), and the HeapWord class has only one field of type char*. That means HeapWordSize depends on the architecture and may be 4 bytes as well – WildWind03 Apr 25 '17 at 17:03
  • @Wild_Wind you're right and I'm not. At this point I have the same struggle understanding it as you do. – Eugene Apr 25 '17 at 19:53
  • But why on Earth does `max_array_length` receive a parameter when it is supposed to be always `T_ARRAY`? I first thought, it is actually supposed to be the element type, as indicated by the use in `type2aelembytes(type)`, but then, it uses `header_size(type)`. This is inconsistent as hell. And the size in words can only overflow when the element type’s size is at least the word size, which implies an object size of `Integer.MAX_VALUE×word size`, for 32 bit JVMs impossible to allocate and for 64 Bit JVMs only applying to `long[]` and `double[]`, irrelevant to any JVM with less than 16 GB heap… – Holger Jul 04 '18 at 14:12
  • Having looked up the code, the parameter *is* the array element type and the cited caller is only responsible for allocating an array of arrays. The `header_size(type)` function is also dedicated to returning the size of the header of an array of the specified type. So this implementation *does* care about the array's element type. The problem is that nonsensical `max_elements_per_size_t`, which may report a number of elements that Java can never allocate (higher than 2³¹). – Holger Jul 04 '18 at 14:54