My question is really simple (which doesn't imply that the answer will be as simple... :D)

Why do arrays in C++ include the size as part of the type, while Java's do not?

I know that Java array reference variables are just pointers to arrays on the heap, but so are C++ pointers to arrays, and I need to provide a size even then. Let's analyze C++ first:

// in C++ :

// an array on the stack:
int array[*constexpr*]; 

// a bidimensional array on the stack:                            
int m_array[*constexpr1*][*constexpr2*]; 

// a multidimensional array on the stack:
int mm_array[*constexpr1*][*constexpr2*][*constexpr3*];

// a dynamic "array" on the heap:
int *array = new int[n];

// a dynamic bidimensional "array" on the heap:               
int (*m_array)[*constexpr*] = new int[n][*constexpr*];  

// a dynamic multidimensional "array" on the heap:
int (*mm_array)[*constexpr1*][*constexpr2*] = new int[n][*constexpr1*][*constexpr2*];

n doesn't have to be a compile-time constant expression, and all the elements are default-initialized. Dynamically allocated "arrays" are not of type array, but the new expression yields a pointer to the first element.

So when I create a dynamic array, all dimensions apart from the first one must be constant expressions (otherwise I couldn't declare the pointer to hold their elements). Is that right?

Now to Java. I can only allocate arrays on the heap, since this is how Java works:

// a dynamic array on the heap:
 int[] array = new int[n];

// a dynamic bidimensional array on the heap:               
 int[][] m_array = new int[n][];  

// a dynamic multidimensional array on the heap:
 int[][][] mm_array = new int [n][][];

In Java, it doesn't seem to care about array size when defining an array reference variable (it's an error in Java to explicitly provide a size), and so I just need to provide the size for the first dimension when creating the array. This allows me to create a jagged array, which I'm not sure I can create in C++ (other than as arrays of pointers).

Can someone explain to me how that works? Maybe what's happening behind the curtains would make it clear. Thanks.

Luca
  • `In Java, it doesn't seem to care about array size when defining an array reference variable (it's an error in Java to explicitly provide a size` this is not true. Can you give a coding example on this? – user3437460 Sep 24 '15 at 17:19
  • The first thing to memorize is that C/C++ DOES NOT HAVE MULTIDIMENSIONAL ARRAYS. – SergeyA Sep 24 '15 at 17:20
  • @user3437460 `int[10] array;` as an array reference variable is illegal in Java – Luca Sep 24 '15 at 17:21
  • @Luca That is because `int[10] array` is invalid syntax in Java. You can provide the array size in Java: `int[] array = new int[10]`. Am I answering your question? – user3437460 Sep 24 '15 at 17:25
  • @user3437460 I think you should read the question more carefully, and distinguish between a reference variable and an object instance. – Luca Sep 24 '15 at 17:26
  • @Luca You have probably confused yourself with terminology carried over from your C/C++ background. In Java, we do not deal with object references manually. We only create objects (instances). The variable itself points to a particular memory address. In Java, we don't play with the addresses and references ourselves. Most things are done for us behind our backs. – user3437460 Sep 24 '15 at 17:43
  • @user3437460 Maybe we must agree on the use of some terms... In Java, a reference variable is what a pointer is in C++ (not a C++ reference). I know that we're not allowed to access the objects themselves; we do everything via "reference variables" (which are pointer-like things). That doesn't answer my question anyway – Luca Sep 24 '15 at 17:54
  • @Luca Let me convince you with supporting text from Oracle documentations. I will post my answer in a min. – user3437460 Sep 24 '15 at 18:09

7 Answers

That's because in Java, all arrays are single-dimensional. A two-dimensional array in Java is merely an array of references to one-dimensional arrays. A three-dimensional array in Java is merely a one-dimensional array of references to arrays of references to arrays of whatever base type you wanted.

Or, in C++ speak: an array in Java, if it is not an array of primitives, is an "array of pointers".

So, for example, this code:

    // assumes: import java.util.Arrays;
    int[][][] arr3D = new int[5][][];

    System.out.println(Arrays.deepToString(arr3D));

Would yield the output:

[null, null, null, null, null]

You can decide to initialize one of its elements:

    arr3D[2] = new int[3][];

And the output from the same println would now be:

[null, null, [null, null, null], null, null]

Still no ints here... Now we can add:

    arr3D[2][2] = new int[7];

And now the result will be:

[null, null, [null, null, [0, 0, 0, 0, 0, 0, 0]], null, null]

So, you can see that this is an "array of pointers".

In C++, when you allocate a multi-dimensional array the way you described, you are allocating a contiguous array which actually holds all the dimensions of the array and is initialized all the way through to the ints. To be able to know whether it's a 10x10x10 array or a 100x10 array, you have to mention the sizes.

Further explanation

In C++, the declaration

int (*mm_array)[5][3];

means "mm_array is a pointer to a 5x3 array of integers". When you assign something to it, you expect that thing to be a pointer to a contiguous block of memory, which is at least big enough to contain 15 integers, or maybe an array of several such 5x3 arrays.

Suppose you didn't mention that "5" and "3".

int (*mm_array)[][]; // This is not a legal declaration in C++

Now, suppose you are handed a pointer to a newly allocated array, and we have statements like:

mm_array[1][1][1] = 2;

Or

mm_array++;

In order to know where to put the number, the compiler needs to know where index 1 of the array is. Element 0 is easy: it's right at the pointer. But where is element 1? It's supposed to be 15 ints after that. But at compile time, the compiler won't know that, because you didn't give the sizes. The same goes for the ++. If it doesn't know that each element of the array is 15 ints, how will it skip that many bytes?

Furthermore, is it a 3x5 or a 5x3 array? If it needs to go to element mm_array[0][2][1], does it need to skip two rows of five elements, or two rows of three elements?

This is why it needs to know, at compile time, the size of its base array. Since the pointer has no information about sizes in it, and merely points to a contiguous block of integers, that information will need to be known in advance.
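The stride argument can be checked with a small sketch. This is an illustration added here, not part of the original answer; the helper names `step_bytes` and `row_bytes` are made up, and the `[5][3]` shape is simply the one used in the declaration discussed in this answer.

```cpp
#include <cassert>
#include <cstddef>

// The extents are part of the pointee type, so its size is a compile-time fact.
static_assert(sizeof(int[5][3]) == 15 * sizeof(int),
              "one element of mm_array spans 15 ints");

// Bytes the compiler skips when it evaluates `p + 1` (or `p++`).
inline std::size_t step_bytes(int (*p)[5][3]) {
    assert(p != nullptr);
    const char* here = reinterpret_cast<const char*>(p);
    const char* next = reinterpret_cast<const char*>(p + 1);
    return static_cast<std::size_t>(next - here);
}

// Bytes skipped by one step of the middle index: a single row of 3 ints.
// sizeof does not evaluate its operand, so p is never dereferenced here.
inline std::size_t row_bytes(int (*p)[5][3]) {
    return sizeof((*p)[0]);
}
```

Calling `step_bytes` on an `int block[2][5][3]` should give `15 * sizeof(int)`, and `row_bytes` should give `3 * sizeof(int)`. Both numbers come from the type alone; nothing is stored in the pointer at run time.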

In Java, the situation is different. The array itself and its sub-arrays, are all Java objects. Each array is one-dimensional. When you have an expression like

arr3D[0][1][2]

arr3D is known to be a reference to an array. That array has length and type information, and one dimension of references. It can check whether 0 is a valid index, and dereference the 0th element, which is itself a reference to an array.

Which means that now it has type and length information again, and then a single dimension of references. It can check whether 1 is a valid index in that array. If it is, it can go to that element, and dereference it, and get the innermost array.

Since the arrays are not a contiguous block, but rather references to objects, you don't need to know sizes at compile time. Everything is allocated dynamically, and only the third level (in this case) has actual contiguous integers in it - only a single dimension, which does not require advance calculation.
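For readers coming from C++, the Java layout can be approximated with a vector of vectors. This is only an analogy, not a description of how the JVM is implemented, and `Jagged`, `make_triangle`, `row_length` and `checked_get` are names invented for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Each row is its own heap-allocated, one-dimensional block that remembers
// its length at run time, roughly like a Java int[] object.
using Jagged = std::vector<std::vector<int>>;

// Builds a triangle: row r holds r + 1 zero-initialized ints.
inline Jagged make_triangle(std::size_t rows) {
    Jagged result(rows);             // similar to `new int[rows][]`, except rows start empty, not null
    for (std::size_t r = 0; r < rows; ++r) {
        result[r].assign(r + 1, 0);  // similar to `result[r] = new int[r + 1]`
    }
    return result;
}

// The length of a row is looked up at run time, not read from the type.
inline std::size_t row_length(const Jagged& a, std::size_t i) {
    assert(i < a.size());
    return a[i].size();
}

// Bounds-checked read: .at() throws std::out_of_range on a bad index,
// loosely comparable to Java's ArrayIndexOutOfBoundsException.
inline int checked_get(const Jagged& a, std::size_t i, std::size_t j) {
    return a.at(i).at(j);
}
```

With `make_triangle(3)`, the rows have lengths 1, 2 and 3, which is the kind of jagged shape a contiguous `int[n][m]` cannot express. The declared type `Jagged` mentions no sizes at all, much like `int[][]` in Java.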

RealSkeptic
  • That explains why you have to provide a constant expression as the size for all dimensions (save the first for dynamic "arrays"), and why in Java you just need to provide the first one. But it doesn't explain why the reference variable in Java (the pointer to the array type, to speak C++) is declared without sizes – Luca Sep 24 '15 at 17:38
  • @Luca it doesn't need the sizes because it doesn't need to know how to divide the long contiguous block that will be assigned to it into rows, columns, etc. Because no such block will be assigned to it. It only has one dimension. – RealSkeptic Sep 24 '15 at 17:39
  • Hmm... I think I'm beginning to understand, but I'm not there yet. Could you please provide a more explicit explanation, or point me to some resource I can read to understand? – Luca Sep 24 '15 at 17:58
  • @Luca I added further explanation, I hope it's sufficient. – RealSkeptic Sep 24 '15 at 18:46
  • When Java dropped what its predecessors had (objects on the stack, objects in static areas like BLOCK DATA in Fortran or "true" C static constants) and kept "fat" objects reachable only by reference, many things became simpler. The language is simpler, the compiler is simpler, and it can be learned more easily. That was by design from the beginning: it was meant to be a language for a broad mass of trained programmers. – Jacek Cz Sep 24 '15 at 19:12
  • Threads like this usually get closed or deleted "as unclear". Happily, no such hunter has come by here ;) – Jacek Cz Sep 24 '15 at 19:14
  • When you say "that array has size and type information", you mean that at run time I know when I step out of the array (there are no buffer overflow bugs), but at compile time I only know that an array is, say, an array of type int[][] (an array that contains references to arrays of ints) – Luca Sep 25 '15 at 07:36
  • @Luca At compile time you only know what type of array is the reference supposed to refer to, and thus, what operations are supposed to be legal (i.e. using the `length`, indexing elements, assigning another array reference etc.) At run time, the variable will either be `null` or refer to an array *object* which, in Java, includes length, type, and the elements of the array (single dimension). – RealSkeptic Sep 25 '15 at 08:02

I guess your real question is, why a stack array must have a fixed size at compile time.

Well, for one, that makes it easier to calculate the addresses of following local variables.

A dynamically sized stack array isn't impossible; it's just more complicated, as you would imagine.

C99 does support variable-length arrays on the stack. Some C++ compilers also support this feature as an extension. See also the question "Array size at run time without dynamic allocation is allowed?"

ZhongYu

The difference between arrays in C++ and Java is that Java array variables are references, like variables of all non-primitive Java types, while C++ array variables are the objects themselves, like all C++ objects (yes, you often hear that C++ arrays are like pointers, but see below).

Declaring an array in C++ allocates memory for the array.

int a[2];
a[0] = 42;
a[1] = 64;

is perfectly legal. However, to allocate memory for the array you must know its size.

Declaring an array in Java does not allocate memory for the array, only for the reference, so if you do:

int[] a;
a[0] = 42;

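you'll get a NullPointerException if a is a field, which defaults to null (as a local variable this would not even compile, because a is not definitely assigned). You first have to construct the array (and in Java, too, you need to know the size to construct it):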

int[] a = new int[2];
a[0] = 42;
a[1] = 64;

So what about C++ arrays being pointers? Strictly speaking they are not pointers. In most expressions an array converts ("decays") to a pointer to its first element, which is why you can do pointer arithmetic with it, but that pointer is not a stored value you can change: the array's address is fixed for its whole lifetime. An array is therefore not assignable, and the following C++ code will not compile:

int a[2];
int b[2];
a = b;
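A short sketch of the same points, added here for illustration (the function names `decays_to_first_element` and `assign_copy` are hypothetical). It uses only standard type traits and `std::array`, the library wrapper that makes a fixed-size array assignable.

```cpp
#include <array>
#include <cassert>
#include <type_traits>

// The size is part of the array type, and an array type is not a pointer type.
static_assert(!std::is_same<int[2], int[3]>::value, "int[2] and int[3] are different types");
static_assert(!std::is_same<int[2], int*>::value, "an array is not a pointer");

// Built-in arrays cannot be assigned; std::array of the same size can.
static_assert(!std::is_assignable<int (&)[2], int (&)[2]>::value,
              "a = b is ill-formed for built-in arrays");
static_assert(std::is_assignable<std::array<int, 2>&, const std::array<int, 2>&>::value,
              "std::array supports assignment");

// An array converts to a pointer to its first element on demand.
inline bool decays_to_first_element(int (&a)[2]) {
    int* p = a;  // array-to-pointer conversion happens here
    return p == &a[0];
}

// Element-wise copy through std::array's assignment operator.
inline std::array<int, 2> assign_copy(const std::array<int, 2>& b) {
    std::array<int, 2> a{};
    a = b;
    assert(a == b);
    return a;
}
```

If you need `a = b` semantics for a fixed-size array in C++, `std::array<int, 2>` is one common choice; the size is still part of its type, just as with `int[2]`.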
Hoopje

Correction:

C only sometimes has the dimension.

In Java, the declaration

 Sometype some[];

is itself (a declaration of) a reference to an object, and that reference can be changed (to a new instance or another array). This may be one reason why, in Java, the dimension cannot be given "on the left side". It is close to

Sometype * some 

in C (forgive me, an array in Java is much more intelligent and safe). If we think about passing an array to a C function, the formal situation is similar to Java: not only do we not have the dimension(s), we cannot have them.

void func(Sometype arg[])
{
 // in C: the size is totally unknown (without a library / framework / convention etc.)
 // in Java: the size is not formally declared, but can be obtained at run time
}
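To make the parameter case concrete, here is a minimal C++ sketch (added for illustration; `sum_decayed`, `sum_sized` and `extent_of` are hypothetical names). It contrasts an array parameter, which loses the size, with a reference-to-array parameter, which keeps the size in the type.

```cpp
#include <cassert>
#include <cstddef>

// Written as an array, but the parameter type is really `const int*`:
// the extent is not part of it, so the length must be passed separately.
inline long sum_decayed(const int arg[], std::size_t count) {
    assert(arg != nullptr || count == 0);
    long total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += arg[i];
    }
    return total;
}

// Taking the array by reference keeps the extent N in the type,
// so the compiler deduces it and no separate length is needed.
template <std::size_t N>
long sum_sized(const int (&arg)[N]) {
    long total = 0;
    for (int value : arg) {
        total += value;
    }
    return total;
}

// The extent recovered purely from the argument's type.
template <std::size_t N>
constexpr std::size_t extent_of(const int (&)[N]) {
    return N;
}
```

For `int values[4] = {1, 2, 3, 4};`, both `sum_decayed(values, 4)` and `sum_sized(values)` return 10, and `extent_of(values)` is 4. Only the second and third get the 4 from the type rather than from the caller.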
Jacek Cz

I believe this has to do with what code the compiler emits to address the array. For dynamic arrays you have an array of arrays, and cells are addressed by following one indirection after another.

But multidimensional arrays are stored in contiguous memory, and the compiler indexes them using a mathematical formula that calculates the cell position from the array's dimensions.

Therefore the dimensions need to be known (declared) to the compiler (all except the first one).
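A sketch of the usual row-major formula for a three-dimensional block follows. It is an illustration rather than what any particular compiler literally emits, and `flat_index` with its parameter names is invented here.

```cpp
#include <cassert>
#include <cstddef>

// Offset (in elements) of [i][j][k] inside a contiguous d0 x d1 x d2 block.
// d0 is used only for the bounds check; the offset itself depends on d1 and d2.
inline std::size_t flat_index(std::size_t i, std::size_t j, std::size_t k,
                              std::size_t d0, std::size_t d1, std::size_t d2) {
    assert(i < d0 && j < d1 && k < d2);
    return (i * d1 + j) * d2 + k;
}
```

For a 4x5x3 block, `flat_index(1, 0, 0, 4, 5, 3)` is 15 and `flat_index(0, 2, 1, 4, 5, 3)` is 7. Changing `d0` does not change either result, which is why that dimension may be a run-time value in `new int[n][5][3]`.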

Galik
  • It doesn't explain why the reference is declared without a size – Luca Sep 24 '15 at 17:23
  • @Luca Array references, just like arrays need to declare the size of all but one of their dimensions because the last dimension is simply scalar but the others are needed to multiply together to calculate the cell index. – Galik Sep 24 '15 at 17:26
  • @Luca Sorry, got that wrong: array references need to declare all their dimensions, it seems (it's been so long since I used arrays in C++). – Galik Sep 24 '15 at 17:29
  • No, Galik, in Java when you declare an array reference variable, you must not declare the size. It's an error if you do. – Luca Sep 24 '15 at 17:35

In Java, it doesn't seem to care about array size when defining an array reference variable (it's an error in Java to explicitly provide a size),

It is not that Java doesn't care about the initial array size when you define an array. The concept of an array in Java is almost totally different from that in C/C++.

First of all, the syntax for creating an array in Java is already different. The reason you still see C/C++ look-alike square brackets in Java array declarations is that, when Java was designed, its authors tried to follow the syntax of C/C++ as much as possible.

From Java docs:

Like declarations for variables of other types, an array declaration has two components: the array's type and the array's name. An array's type is written as type[], where type is the data type of the contained elements; the brackets are special symbols indicating that this variable holds an array. The size of the array is not part of its type (which is why the brackets are empty)

When you declare an array in Java, e.g.:

int[] array;

you are merely declaring a variable that can refer to what Java calls an array object (which acts like an array); no array has been created yet.

The brackets [ ] are merely a symbol indicating that this is an array type. A number has no place inside a symbol that Java uses only to denote the type!

The brackets look like those used in C/C++ array declarations, but Java gives them a different meaning while keeping the syntax similar to C/C++.

Another description from Java docs:

Brackets are allowed in declarators as a nod to the tradition of C and C++.


Part of your question:

This allows me to create a jagged array, which I'm not sure I can create in C++ (other than as arrays of pointers).

From Java Docs:

In the Java programming language, a multidimensional array is an array whose components are themselves arrays. This is unlike arrays in C or Fortran. A consequence of this is that the rows are allowed to vary in length

If you are interested to find out more on Java Arrays, visit:

user3437460
  • **Remarks:** This answer does not fully address all the issues brought up by the OP, as it would take a much longer answer to cover all his queries. However, it does address certain aspects of the OP's question, particularly the empty brackets in Java declarations. – user3437460 Sep 24 '15 at 18:31

You're confusing the meaning of some of your C++ arrays: e.g., your 'm_array' is a pointer to an array of values - see the following compilable C++ example:

int array_of_values[3] = { 1, 2, 3 };
int (*m_array)[3] = &array_of_values;

The equivalent Java is:

int[] array_of_values = {1, 2, 3};
int[] m_array = array_of_values;

Similarly, your 'mm_array' is a pointer to an array of arrays:

int array_of_array_of_values[3][2] = { {1, 2}, {3, 4}, {5, 6} };
int (*mm_array)[3][2] = &array_of_array_of_values;

The equivalent Java is:

int[][] array_of_array_of_values = { {1, 2}, {3, 4}, {5, 6} };
int[][] mm_array = array_of_array_of_values;
Dave Doknjas