34

I'm trying to understand the nature of type decay. For example, we all know that arrays decay into pointers in certain contexts. I want to understand why int[] decays to int* while two-dimensional arrays don't decay to the pointer type I would expect. Here is a test case:

std::is_same<int*, std::decay<int[]>::type>::value; // true

This returns true as expected, but this doesn't:

std::is_same<int**, std::decay<int[][1]>::type>::value; // false

Why is this not true? I finally found a way to make it return true, and that was by making the first dimension a pointer:

std::is_same<int**, std::decay<int*[]>::type>::value; // true

The assertion also holds for any element type made of pointers, as long as only the outermost level is an array. For example, int***[] decays to int**** (true).

Can I have an explanation as to why this is happening? Why don't the array types correspond to the pointer types as would be expected?

Me myself and I

4 Answers

63

Why does int*[] decay into int** but not int[][]?

Because it would be impossible to do pointer arithmetic with it.

For example, int p[5][4] means an array of (length-4 array of int). There are no pointers involved, it's simply a contiguous block of memory of size 5*4*sizeof(int). When you ask for a particular element, e.g. int a = p[i][j], the compiler is really doing this:

char *tmp = (char *)p           // Work in units of bytes (char)
          + i * sizeof(int[4])  // Offset for outer dimension (int[4] is a type)
          + j * sizeof(int);    // Offset for inner dimension
int a = *(int *)tmp;            // Back to the contained type, and dereference

Obviously, it can only do this because it knows the size of the "inner" dimension(s). Decaying to an int (*)[4] retains this information; it's a pointer to (length-4 array of int). However, an int ** doesn't; it's merely a pointer to (pointer to int), so the row size needed for the i offset would be lost.
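As a rough illustration of the arithmetic above, here is a minimal C++ sketch. The names element_address and manual_offset_matches are hypothetical helpers invented for this example; they are not part of any library. The sketch assumes a 5x4 array like the p in the answer and checks that the hand-computed byte offset lands on the same element the compiler finds with p[i][j].

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// The decayed type keeps the inner dimension: pointer to (length-4 array of int).
static_assert(std::is_same<int (*)[4], std::decay<int[5][4]>::type>::value,
              "int[5][4] decays to int (*)[4]");
static_assert(!std::is_same<int **, std::decay<int[5][4]>::type>::value,
              "int[5][4] does not decay to int **");

// Hypothetical helper: computes the address of p[i][j] by hand, in bytes.
// This only works because the parameter type still carries the row length 4.
const int *element_address(const int (*p)[4], std::size_t i, std::size_t j) {
    const char *bytes = reinterpret_cast<const char *>(p);
    bytes += i * sizeof(int[4]);  // skip i whole rows
    bytes += j * sizeof(int);     // skip j elements within the row
    return reinterpret_cast<const int *>(bytes);
}

// Hypothetical check: the manual offset and the built-in indexing agree.
bool manual_offset_matches(std::size_t i, std::size_t j) {
    static const int grid[5][4] = {};
    return element_address(grid, i, j) == &grid[i][j];
}
```

With an int ** parameter instead, the sizeof(int[4]) term would have nothing to take its place, which is the information the answer says is missing.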

For another take on this, see the following sections of the C FAQ:

(This is all for C, but this behaviour is essentially unchanged in C++.)

Oliver Charlesworth
  • Another way to look at this: `int[M][N]` to `int**` requires TWO conversions (which is disallowed) as opposed to one conversion (which is allowed). The first conversion turns `int[M][N]` into a *pointer* to the first element of the array. The type of the first element is `int[N]`, so `int[M][N]` first converts into `int(*)[N]`, which would then need to convert into `int**`, requiring the *inner* array `int[N]` to convert into `int*`. – Nawaz Jan 06 '13 at 15:43
10

C was not really "designed" as a language; instead, features were added as needs arose, with an effort not to break earlier code. Such an evolutionary approach was a good thing in the days when C was being developed, since it meant that for the most part developers could reap the benefits of the earlier improvements in the language before everything the language might need to do was worked out. Unfortunately, the way in which array- and pointer handling have evolved has led to a variety of rules which are, in retrospect, unfortunate.

In the C language of today, there is a fairly substantial type system, and variables have clearly defined types, but things were not always thus. A declaration char arr[8]; would allocate 8 bytes in the present scope, and make arr point to the first of them. The compiler wouldn't know that arr represented an array--it would represent a char pointer just like any other char*. From what I understand, if one had declared char arr1[8], arr2[8];, the statement arr1 = arr2; would have been perfectly legal, being somewhat equivalent conceptually to char *st1 = "foo", *st2 = "bar"; st1 = st2;, but would have almost always represented a bug.

The rule that arrays decay into pointers stemmed from a time when arrays and pointers really were the same thing. Since then, arrays have come to be recognized as a distinct type, but the language needed to remain essentially compatible with the days when they weren't. When the rules were being formulated, the question of how two-dimensional arrays should be handled wasn't an issue because there was no such thing. One could do something like char foo[20]; char *bar[4]; int i; for (i=0; i<4; i++) bar[i] = foo + (i*5); and then use bar[x][y] in the same way as one would now use a two-dimensional array, but a compiler wouldn't view things that way--it just saw bar as a pointer to a pointer. If one wanted to make bar[1] point somewhere completely different from bar[2], one could perfectly legally do so.
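The row-pointer idiom described above can be sketched in present-day C++ as follows. The function name read_via_row_pointers and the fill pattern ('a', 'b', 'c', ...) are assumptions made up for this example; only the foo/bar layout comes from the answer.

```cpp
#include <cassert>
#include <type_traits>

// An array of pointers does decay to a pointer to pointer.
static_assert(std::is_same<char **, std::decay<char *[4]>::type>::value,
              "char *[4] decays to char **");

// Hypothetical demo: build 4 row pointers into a flat 20-byte buffer,
// then index through them as if they formed a 4x5 table.
char read_via_row_pointers(int x, int y) {
    char foo[20];
    for (int k = 0; k < 20; ++k) {
        foo[k] = static_cast<char>('a' + k);  // assumed fill pattern
    }

    char *bar[4];
    for (int i = 0; i < 4; ++i) {
        bar[i] = foo + (i * 5);  // each row starts 5 chars after the last
    }

    // Two loads: first the row pointer bar[x], then the char it points at.
    return bar[x][y];
}
```

Unlike a true char[4][5], each bar[i] here is a separate, reassignable pointer, which is why this layout is compatible with char ** and the contiguous one is not.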

When two-dimensional arrays were added to C, it was not necessary to maintain compatibility with earlier code that declared two-dimensional arrays, because there wasn't any. While it would have been possible to specify that char bar[4][5]; would generate code equivalent to what was shown using the foo[20], in which case a char[][] would have been usable as a char**, it was thought that just as assigning array variables would have been a mistake 99% of the time, so too would have been re-assignment of array rows, had that been legal. Thus, arrays in C are recognized as distinct types, with their own rules which are a bit odd, but which are what they are.

supercat
8

Because int[M][N] and int** are incompatible types.

However, int[M][N] can decay into the int (*)[N] type. So the following:

std::is_same<int(*)[1], std::decay<int[1][1]>::type>::value;

should give you true.
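The cases from the question can be checked at compile time in one place. This is a small sketch rather than anything from the original posts; sum_rows and sum_example are hypothetical names, and the 2x3 data is made up for the example.

```cpp
#include <cassert>
#include <type_traits>

// Decay strips only the outermost array level.
static_assert(std::is_same<int *, std::decay<int[]>::type>::value,
              "int[] decays to int *");
static_assert(std::is_same<int (*)[1], std::decay<int[][1]>::type>::value,
              "int[][1] decays to int (*)[1]");
static_assert(std::is_same<int **, std::decay<int *[]>::type>::value,
              "int *[] decays to int **");
static_assert(std::is_same<int ****, std::decay<int ***[]>::type>::value,
              "int ***[] decays to int ****");

// Hypothetical helper: takes the decayed form of an int[M][3] argument.
int sum_rows(const int (*rows)[3], int row_count) {
    int total = 0;
    for (int r = 0; r < row_count; ++r) {
        for (int c = 0; c < 3; ++c) {
            total += rows[r][c];
        }
    }
    return total;
}

// Hypothetical usage: grid decays to const int (*)[3] at the call site.
int sum_example() {
    const int grid[2][3] = {{1, 2, 3}, {4, 5, 6}};
    return sum_rows(grid, 2);
}
```

Passing grid to a function that expects const int ** instead would be rejected by the compiler, for the reason given in this answer.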

Nawaz
3

Two-dimensional arrays are not stored as pointers to pointers, but as a contiguous block of memory.

An object declared as type int[y][x] is a block of size sizeof(int) * x * y, whereas an object of type int ** is a pointer to an int*.
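The size claim can be verified directly. In this sketch the dimensions 3 and 4 and the function name row_stride_in_bytes are arbitrary choices for illustration.

```cpp
#include <cassert>
#include <cstddef>

// A 2-D array is exactly rows * cols ints, with no pointer table inside it.
static_assert(sizeof(int[3][4]) == 3 * 4 * sizeof(int),
              "int[3][4] occupies 3 * 4 ints");

// Hypothetical helper: byte distance between the starts of row 0 and row 1.
std::ptrdiff_t row_stride_in_bytes() {
    static const int grid[3][4] = {};
    const char *row0 = reinterpret_cast<const char *>(&grid[0]);
    const char *row1 = reinterpret_cast<const char *>(&grid[1]);
    return row1 - row0;  // rows sit back to back inside one block
}
```

The stride equals the size of one row, so consecutive rows are adjacent in memory rather than being reached through stored pointers.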

Dave Hillier