I'm trying to understand the different ways of declaring an array (of one or two dimensions) in C++ and what exactly they return (pointers, pointers to pointers, etc.)

Here are some examples:

int A[2][2] = {0,1,2,3};
int A[2][2] = {{0,1},{2,3}};
int **A = new int*[2];
int *A = new int[2][2];

In each case, what exactly is A? Is it a pointer, double pointer? What happens when I do A+1? Are these all valid ways of declaring matrices?

Also, why does the first option not need the second set of curly braces to define "columns"?

  • Where did you find your examples? – Cheers and hth. - Alf Oct 24 '14 at 22:36
  • @Cheersandhth.-Alf, mostly from questions my prof gave me. He intentionally tries to make it confusing sometimes. –  Oct 24 '14 at 22:37
  • @clcto, yeah, but I'm trying to understand what exactly A is so I can draw a small memory diagram and work with the structure. –  Oct 24 '14 at 22:38
  • The thing about pointers is they look weird at first glance. I doubt he was trying to confuse you. – Ryan Oct 24 '14 at 22:38
  • @self, I think he was. He asks a lot of "which of these declarations are valid" type questions. So some are supposed to look like they might work, but actually don't compile. These are from last week's test. –  Oct 24 '14 at 22:39
  • 3 first one are double pointers, last one is a single pointer. – Stasik Oct 24 '14 at 22:40
  • @Stasik, thanks. What's the difference between the first and second one though. The lack of curly braces in the first one seem ambiguous (what is A[0][1] and what is A[1][0])? Are they laid out in memory differently? –  Oct 24 '14 at 22:42
  • @Stasik: The first two ones are not double pointer, but 2 dimensionnal arrays of int. `int **A = new int*[2];` is an array of uninitialized pointer. – Jarod42 Oct 24 '14 at 22:43
  • @Jarod42, aren't those both the same thing? Just like an array is the same thing as a pointer? –  Oct 24 '14 at 22:44
  • @Jarod42 thanks for the hint http://stackoverflow.com/a/11556702/383834 – Stasik Oct 24 '14 at 22:51
  • The last assignment should be `int (*A)[2] = new int[2][2];`. – Jarod42 Oct 24 '14 at 22:55
  • Have a look at the following articles to understand how to interpret declarations: [The "Clockwise/Spiral Rule"](http://c-faq.com/decl/spiral.anderson.html) and [How to interpret complex C/C++ declarations](http://www.codeproject.com/Articles/7042/How-to-interpret-complex-C-C-declarations). – Remy Lebeau Oct 24 '14 at 23:43

7 Answers

3

Looks like you got a plethora of answers while I was writing mine, but I might as well post my answer anyway so I don't feel like it was all for nothing...

(All sizeof results are taken from a 32-bit VC2012 build; pointer sizes would, of course, double in a 64-bit build.)

size_t f0(int* I);
size_t f1(int I[]);
size_t f2(int I[2]);

int main(int argc, char** argv)
{
    // A0, A1, and A2 are local (on the stack) two-by-two integer arrays
    // (they are technically not pointers)

    // nested braces may be omitted (brace elision): the initializers simply
    // fill the array row by row, in the order the elements sit in memory
    int A0[2][2] = {0,1,2,3};

    // the first dimension may be left out; the compiler deduces it (2 here)
    // from the initializer, and the nested braces make the rows explicit
    int A1[][2] = {{0,1},{2,3}};

    // this still works, of course. Very explicit.
    int A2[2][2] = {{0,1},{2,3}};

    // A3 is a pointer to an integer pointer. New constructs an array of two
    // integer pointers (on the heap) and returns a pointer to the first one.
    int **A3 = new int*[2];
    // if you wanted to access A3 with a double subscript, you would have to
    // make the 2 int pointers in the array point to something valid as well
    A3[0] = new int[2];
    A3[1] = new int[2];
    A3[0][0] = 7;

    // this one doesn't compile because new doesn't return "pointer to int"
    // when it is called like this (left commented out so the sample builds)
    // int *A4_1 = new int[2][2];

    // this edit of the above works but can be confusing
    int (*A4_2)[2] = new int[2][2];
    // it allocates a two-by-two array of integers and returns a pointer to
    // where the first integer is, however the type of the pointer that it
    // returns is "pointer to integer array"

    // now it works like the 2by2 arrays from earlier,
    // but A4_2 is a pointer to the **heap**
    A4_2[0][0] = 6;
    A4_2[0][1] = 7;
    A4_2[1][0] = 8;
    A4_2[1][1] = 9;


    // looking at the sizes can shed some light on subtle differences here
    // between pointers and arrays
    A0[0][0] = sizeof(A0);        // 16 // typeof(A0) is int[2][2] (2by2 int array, 4 ints total, 16 bytes)
    A0[0][1] = sizeof(A0[0]);     // 8  // typeof(A0[0]) is int[2] (array of 2 ints)

    A1[0][0] = sizeof(A1);        // 16 // typeof(A1) is int[2][2]
    A1[0][1] = sizeof(A1[0]);     // 8  // typeof(A1[0]) is int[2]

    A2[0][0] = sizeof(A2);        // 16 // typeof(A2) is int[2][2]
    A2[0][1] = sizeof(A2[0]);     // 8  // typeof(A2[0]) is int[2]

    A3[0][0] = sizeof(A3);        // 4 // typeof(A3) is int**
    A3[0][1] = sizeof(A3[0]);     // 4 // typeof(A3[0]) is int*

    A4_2[0][0] = sizeof(A4_2);    // 4 // typeof(A4_2) is int(*)[2] (pointer to array of 2 ints)
    A4_2[0][1] = sizeof(A4_2[0]); // 8 // typeof(A4_2[0]) is int[2] (the first array of 2 ints)
    A4_2[1][0] = sizeof(A4_2[1]); // 8 // typeof(A4_2[1]) is int[2] (the second array of 2 ints)
    A4_2[1][1] = sizeof(*A4_2);   // 8 // typeof(*A4_2) is int[2] (different way to reference the first array of 2 ints)

    // confusion between pointers and arrays often arises from the common practice of
    // allowing arrays to transparently decay (implicitly convert) to pointers

    A0[1][0] = f0(A0[0]); // f0 returns 4.
    // Not surprising because declaration of f0 demands int*

    A0[1][1] = f1(A0[0]); // f1 returns 4.
    // Still not too surprising because declaration of f1 doesn't
    // explicitly specify array size

    A2[1][0] = f2(A2[0]); // f2 returns 4.
    // Much more surprising because declaration of f2 explicitly says
    // it takes "int I[2]"

    int B0[25];
    B0[0] = sizeof(B0); // 100 == (sizeof(int)*25)
    B0[1] = f2(B0); // also compiles and returns 4.
    // Don't do this! just be aware that this kind of thing can
    // happen when arrays decay.

    return 0;
}

// these are always returning 4 above because, when compiled,
// all of these functions actually take int* as an argument
size_t f0(int* I)
{
    return sizeof(I);
}

size_t f1(int I[])
{
    return sizeof(I);
}

size_t f2(int I[2])
{
    return sizeof(I);
}

// indeed, if I try to overload f0 like this, it will not compile.
// it will complain that, "function 'size_t f0(int *)' already has a body"
size_t f0(int I[2])
{
    return sizeof(I);
}

Yes, this sample has plenty of signed/unsigned int mismatches, but that part isn't relevant to the question. Also, don't forget to delete everything created with new and delete[] everything created with new[].
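As a rough sketch of that cleanup for the two heap layouts used in the sample (the function name `allocate_and_release` is made up for illustration and is not part of the code above):

```cpp
#include <cassert>

// Hypothetical helper: builds both heap layouts from the sample, uses them,
// and releases every allocation with the matching form of delete.
int allocate_and_release()
{
    // array of two int pointers, each pointing at its own separate row
    int **A3 = new int*[2];
    A3[0] = new int[2];
    A3[1] = new int[2];
    A3[1][1] = 5;

    // one contiguous block holding all 2x2 ints
    int (*A4_2)[2] = new int[2][2];
    A4_2[1][1] = 9;

    int sum = A3[1][1] + A4_2[1][1];
    assert(sum == 14);

    // three new[] calls were made for A3, so three delete[] calls are needed:
    // rows first, then the array of row pointers
    delete[] A3[0];
    delete[] A3[1];
    delete[] A3;

    // a single new[] needs a single delete[]
    delete[] A4_2;

    return sum;
}
```

The order matters for A3: once `delete[] A3` has run, the row pointers can no longer be read, so the rows have to go first.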

EDIT:

"What happens when I do A+1?" -- I missed this earlier.

Operations like this would be called "pointer arithmetic" (even though I called out toward the top of my answer that some of these are not pointers, but they can turn into pointers).

If I have a pointer P to an array of someType, then subscript access P[n] is exactly the same as using the syntax *(P + n). The compiler takes the size of the pointed-to type into account in both cases. So the generated machine code actually computes the byte address P + n*sizeof(someType), or equivalently P + n*sizeof(*P), and reads from there, because the physical CPU doesn't know or care about all our made-up "types". In the end, all pointer offsets have to be a byte count. For consistency, using array names like pointers works the same way here.

Turning back to the samples above: A0, A1, A2, and A4_2 all behave the same with pointer arithmetic.

A0[0] is the same as *(A0+0), which references the first int[2] of A0

similarly:

A0[1] is the same as *(A0+1) which offsets the "pointer" by sizeof(A0[0]) (i.e. 8, see above) and it ends up referencing the second int[2] of A0

A3 acts slightly differently. This is because A3 is the only one that doesn't store all 4 ints of the 2 by 2 array contiguously. In my example, A3 points to an array of 2 int pointers, each of which points to a completely separate array of two ints. Using A3[1] or *(A3+1) would still end up directing you to the second of the two int arrays, but it would do so by offsetting only 4 bytes from the beginning of A3 (using 32-bit pointers for my purposes), which gives you a pointer that tells you where to find the second two-int array. I hope that makes sense.
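To make the two step sizes concrete, here is a small sketch. The helper names are hypothetical, and the array of row pointers lives on the stack here purely to keep the example short:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: distance in bytes between two addresses
// that lie inside the same array object.
std::ptrdiff_t byte_distance(const void* from, const void* to)
{
    return static_cast<const char*>(to) - static_cast<const char*>(from);
}

// For a true 2D array, A0 decays to int(*)[2], so A0 + 1 skips a whole row.
std::ptrdiff_t row_step_of_2d_array()
{
    int A0[2][2] = {{0, 1}, {2, 3}};
    std::ptrdiff_t step = byte_distance(A0, A0 + 1);
    assert(step == static_cast<std::ptrdiff_t>(sizeof(A0[0])));
    return step;
}

// For an int**, A3 + 1 only skips one pointer.
std::ptrdiff_t step_of_pointer_array()
{
    int* rows[2] = {nullptr, nullptr};
    int** A3 = rows;
    std::ptrdiff_t step = byte_distance(A3, A3 + 1);
    assert(step == static_cast<std::ptrdiff_t>(sizeof(int*)));
    return step;
}
```

The first function returns the size of two ints and the second the size of one pointer, whatever those happen to be on the platform in use.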

iwolf
2
int A[2][2] = {0,1,2,3};
int A[2][2] = {{0,1},{2,3}};

These declare A as array of size 2 of array of size 2 of int. The declarations are absolutely identical.

int **A = new int*[2];

This declares a pointer to pointer to int, initialized to point to the first element of a heap-allocated array of two pointers. You must also allocate memory for each of these two pointers to point to if you want to use it as a two-dimensional array.

int *A = new int[2][2];

And this doesn't compile, because the type of the right-hand side is pointer to array of size 2 of int, which cannot be converted to pointer to int.

In all valid cases A + 1 is the same as &A[1], which means it points to the second element of the array: in the case of int A[2][2], to the second array of two ints, and in the case of int **A, to the second pointer in the array.
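If you want the compiler to confirm these types, something along these lines should do it (a sketch assuming C++11; the function names are invented for the example):

```cpp
#include <cassert>
#include <type_traits>

// Array case: A + 1 is a pointer to a row of two ints.
bool check_array_case()
{
    int A[2][2] = {{0, 1}, {2, 3}};
    static_assert(std::is_same<decltype(A + 1), int (*)[2]>::value,
                  "A + 1 has type pointer to array of 2 ints");
    static_assert(std::is_same<decltype(&A[1]), int (*)[2]>::value,
                  "&A[1] has the same type");
    bool ok = (A + 1 == &A[1]) && ((*(A + 1))[0] == 2);
    assert(ok);
    return ok;
}

// Pointer case: A + 1 is again an int**, pointing at the second pointer.
bool check_pointer_case()
{
    int* rows[2] = {nullptr, nullptr};  // stands in for new int*[2]
    int** A = rows;
    static_assert(std::is_same<decltype(A + 1), int**>::value,
                  "A + 1 has type pointer to pointer to int");
    bool ok = (A + 1 == &A[1]);
    assert(ok);
    return ok;
}
```

The static_asserts fail at compile time if the types are not what is claimed, so nothing needs to be run to check them.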

Anton Savin
  • Awesome! A few clarifications though. I thought an array is the same thing as a pointer. In other words `A[1] = *(A+1)` correct? Also, in the case of the first example, what would I get from `*(A+2)`? It's "out of bounds" of the "rows" of that array, no? Same with the second example: `*(A)` would be `{0,1}` and `*(A+1)` would be `{2,3}` - but what would `*(A+2)` give? –  Oct 24 '14 at 22:48
  • @MaxMackie yes `*(A+2)` is out of bounds in all your cases. First and second example are identical, don't look at the "missing" braces, what matters is the type. – Anton Savin Oct 24 '14 at 22:50
  • @AntonSavin this will work though? int *A = new int[2][2]; – Stasik Oct 24 '14 at 22:52
  • @Stasik this is illegal as I wrote – Anton Savin Oct 24 '14 at 22:54
  • @Stasik: It should be `int (*A)[2] = new int[2][2];` – Jarod42 Oct 24 '14 at 22:54
  • @AntonSavin I meant int **A = new int[2][2], sorry. – Stasik Oct 24 '14 at 22:55
  • @Stasik no again, the types are different (like in first and third examples of OP) – Anton Savin Oct 24 '14 at 22:57
  • @Stasik: `int (*)[2]` is not the same type as `int **`, and not convertible moreover. – Jarod42 Oct 24 '14 at 22:57
2

For the array declaration, the first specified dimension is the outermost one, an array that contains other arrays.

For the pointer declarations, each * adds another level of indirection.

The syntax was designed, for C, to let declarations mimic the use. Both the C creators and the C++ creator (Bjarne Stroustrup) have described the syntax as a failed experiment. The main problem is that it doesn't follow the usual rules of substitution in mathematics.

In C++11 you can use std::array instead of the square brackets declaration.

Also you can define a similar ptr type builder e.g.

template< class T >
using ptr = T*;

and then write

ptr<int> p;
ptr<ptr<int>> q;
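For the std::array route, a 2 by 2 matrix might be written roughly like this (the alias `Matrix2x2` and the function names are my own, chosen only for illustration):

```cpp
#include <array>
#include <cassert>

// Hypothetical alias: a 2x2 matrix built from nested std::array
using Matrix2x2 = std::array<std::array<int, 2>, 2>;

Matrix2x2 make_matrix()
{
    // fully braced form: each std::array wraps a built-in array,
    // hence the doubled braces at every level
    Matrix2x2 m = {{{{0, 1}}, {{2, 3}}}};
    assert(m[1][0] == 2);
    return m;  // unlike a built-in array, this can be returned by value
}

int sum_matrix(const Matrix2x2& m)
{
    int sum = 0;
    for (const auto& row : m)   // each row is a std::array<int, 2>
        for (int value : row)
            sum += value;
    return sum;
}
```

A std::array never decays to a pointer implicitly, so the size information survives when it is passed to a function.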
Cheers and hth. - Alf
1

The other answers have covered the other declarations but I will explain why you don't need the braces in the first two initializations. The reason why these two initializations are identical:

int A[2][2] = {0,1,2,3};
int A[2][2] = {{0,1},{2,3}};

is because it's covered by aggregate initialization. Braces are allowed to be "elided" (omitted) in this instance.

The C++ standard provides an example in § 8.5.1:

[...]

float y[4][3] = {
  { 1, 3, 5 },
  { 2, 4, 6 },
  { 3, 5, 7 },
};

[...]

In the following example, braces in the initializer-list are elided; however the initializer-list has the same effect as the completely-braced initializer-list of the above example,

float y[4][3] = {
  1, 3, 5, 2, 4, 6, 3, 5, 7
};

The initializer for y begins with a left brace, but the one for y[0] does not, therefore three elements from the list are used. Likewise the next three are taken successively for y[1] and y[2].
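A quick way to convince yourself, and to see where the inner braces do change the meaning (when a row is only partially initialized), is a sketch like the following. The function names are hypothetical:

```cpp
#include <cassert>
#include <cstring>

// With all four values present, the elided and braced forms are identical.
bool elided_and_braced_match()
{
    int elided[2][2] = {0, 1, 2, 3};
    int braced[2][2] = {{0, 1}, {2, 3}};
    bool same = std::memcmp(elided, braced, sizeof(elided)) == 0;
    assert(same);
    return same;
}

// Inner braces start a new row: {{1}, {2}} means {{1, 0}, {2, 0}}.
int second_row_first_value_braced()
{
    int m[2][2] = {{1}, {2}};
    return m[1][0];
}

// Without inner braces the values fill in order: {1, 2} means {{1, 2}, {0, 0}}.
int second_row_first_value_elided()
{
    int m[2][2] = {1, 2};
    return m[1][0];
}
```

Some compilers warn about the elided form (for example with a missing-braces warning enabled), but it is still valid.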

0

OK, I will try to explain it to you:

  1. This is an initialization. You create a two-dimensional array with the values:
    • A[0][0] -> 0
    • A[0][1] -> 1
    • A[1][0] -> 2
    • A[1][1] -> 3
  2. This is exactly the same as above, but here you use braces. Always write it like this; it is easier to read.
  3. int **A means you have a pointer to a pointer to int. When you do new int*[2], you reserve memory for 2 pointers to int.
  4. This will not compile.
Mosa
0
int A[2][2] = {0,1,2,3};
int A[2][2] = {{0,1},{2,3}};

These two are equivalent.
Both mean: "I declare a two-dimensional array of integers. The array is of size 2 by 2."

Memory, however, is not two-dimensional; it is not laid out in grids, but (conceptually) in one long line. In a multi-dimensional array, each row is just allocated in memory right after the previous one. Because of this, we can go to the memory address where A starts and either store two lines of length 2 or one line of length 4, and the end result in memory will be the same.
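The "one long line" layout can be checked with a small sketch like this one, where `offset_in_ints` and `check_layout` are hypothetical helpers written for the example:

```cpp
#include <cassert>
#include <cstddef>

// A 2x2 int array occupies exactly four ints, with no gap between rows.
static_assert(sizeof(int[2][2]) == 4 * sizeof(int), "rows are contiguous");

// Hypothetical helper: how many ints lie between the start of A and A[r][c].
std::size_t offset_in_ints(int (&A)[2][2], int r, int c)
{
    const char* base = reinterpret_cast<const char*>(&A[0][0]);
    const char* elem = reinterpret_cast<const char*>(&A[r][c]);
    return static_cast<std::size_t>(elem - base) / sizeof(int);
}

void check_layout()
{
    int A[2][2] = {{0, 1}, {2, 3}};
    assert(offset_in_ints(A, 0, 1) == 1);
    assert(offset_in_ints(A, 1, 0) == 2);  // second row starts right after the first
    assert(offset_in_ints(A, 1, 1) == 3);
}
```

In other words, element (r, c) sits r*2 + c ints from the start of the array.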

int **A = new int*[2];

Declares a pointer to a pointer to int called A.
A stores the address of the first element of an array of 2 pointers to int. This array is allocated on the heap.

int *A = new int[2][2];

A is declared as a pointer to an int.
The intent is for that int to be the beginning of a 2x2 int array allocated on the heap.

Apparently this is invalid:

prog.cpp:5:23: error: cannot convert ‘int (*)[2]’ to ‘int*’ in initialization
  int *A = new int[2][2];

But due to what we saw with the first two, this will work (and gives the same layout of four contiguous ints in memory):

int *A = new int[4];
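With a flat allocation like this you lose the A[r][c] syntax, so the row/column arithmetic has to be done by hand. A minimal sketch, assuming row-major order and using made-up helper names:

```cpp
#include <cassert>

// Hypothetical helper: maps (row, col) onto an index into the flat array.
int flat_index(int row, int col, int cols = 2)
{
    return row * cols + col;
}

int read_flat(int row, int col)
{
    int *A = new int[4];
    for (int i = 0; i < 4; ++i)
        A[i] = i;                      // same contents as {{0,1},{2,3}}

    int value = A[flat_index(row, col)];
    assert(value == row * 2 + col);

    delete[] A;                        // allocated with new[], so delete[]
    return value;
}
```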
Baldrickk
-2
int A[2][2] = {0,1,2,3};

A is an array of 2 arrays of 2 ints, so 4 ints in total, stored contiguously. Coder has initialized all elements linearly, as they are laid out in memory. As usual, since A is an array, A converts to the address of its first element (here, the first row), so A + 1 (after application of pointer math) offsets that address by the size of 2 ints. A + 1 therefore points to the second row of the array, whose first element is the value 2.

Edit: Accessing a two-dimensional array using a single array operator operates along the first dimension. So A[1] is the second row, and it decays to the address of A[1][0]. A + 1 results in the equivalent pointer addition.

int A[2][2] = {{0,1},{2,3}};

A is the same kind of array: 2 arrays of 2 ints, so 4 ints in total. Coder has initialized the elements by rows. For the same reasons as above, A + 1 points to the second row, which starts with the value 2.

int **A = new int*[2];

A is a pointer to int pointer that has been initialized to point to an array of 2 pointers to int. Since A is a pointer, A + 1 takes the value of A, which is the address of the pointer array (and thus of its first element), and adds 1 (pointer math), so that it points to the second element of the array. As the pointers in the array were not initialized, dereferencing the pointer stored at A + 1 (reading or writing through it) will be dangerous (who knows what value is there and what it would actually point to, if it's even a valid address).

int *A = new int[2][2];

Edit: as Jarod42 has pointed out, this is invalid. I think this may be closer to what you meant. If not, we can clarify in the comments.

int *A = new int[4];

A is a pointer to int that has been initialized to point to an anonymous array of 4 ints. Since A is a pointer, A + 1 takes the value of A, which is the address of the int array (and thus of its first element), and adds 1 (pointer math), so that it points to the second element of the array.

Some takeaways:

  1. In the first two cases, A is the array itself (which converts to the address of its first element), while in the last two, A is a pointer whose value happens to have been initialized to the address of an array.
  2. In the first two, A cannot be made to refer to other memory once declared. In the latter two, A can be changed after initialization to point to some other memory.
  3. That said, you need to be careful with how you might use pointers with an array element. Consider the following:

    int *a = new int(5);
    int *b = new int(6);
    int c[2] = {*a, *b};
    int *d = a;
    

c + 1 is not the same as d + 1. In fact, dereferencing d + 1 is very dangerous. Why? Because c is an array of int that has been initialized by dereferencing a and b. That means that c is the address of a chunk of memory, where the first memory location holds a copy of the value pointed to by variable a, and the next memory location holds a copy of the value pointed to by variable b. On the other hand, d just holds the same address as a, which points to a single int. So you can see, c != d; therefore, there is no reason that c + 1 == d + 1.
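A short sketch of that difference, reusing the same four variables (the wrapping function `copies_are_independent` is hypothetical):

```cpp
#include <cassert>

bool copies_are_independent()
{
    int *a = new int(5);
    int *b = new int(6);
    int c[2] = {*a, *b};  // c stores copies of the two values
    int *d = a;           // d points at the same single int as a

    *d = 50;              // changes *a, but not the copy held in c[0]
    bool ok = (c[0] == 5) && (*(c + 1) == 6) && (*a == 50);
    assert(ok);

    // d + 1 may be formed (one past a single int) but must never be
    // dereferenced; c + 1 is fine because c really has a second element

    delete a;
    delete b;
    return ok;
}
```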

iheanyi
  • `int *A = new int[2][2];` is invalid. and your `int a = new int(5);` is also invalid, you probably mean `int* a = new int(5);` (but then `d` is also of the wrong type...) – Jarod42 Oct 24 '14 at 23:47
  • @Jarod42 Thanks, I'd initially wrote that with a and b on the stack (no new) and forgot the update. As for the other error, I completely glossed over that. – iheanyi Oct 25 '14 at 00:12
  • @Jarod42 crap, forgot that one too. This is what happens when you make last minute changes from the stack to the heap! – iheanyi Oct 25 '14 at 00:16
  • Yeah, I think the intent I was trying to convey is with the latest change int c[2] = {*a, *b}. Really, I was initially framing this with a struct/class, but then decided I could show the problem with a simpler example...and then proceed to trip myself up. – iheanyi Oct 25 '14 at 00:27
  • Now, your explanation mismatches with the code... `c` is not an *array of pointers to int*... – Jarod42 Oct 25 '14 at 00:38
  • Thanks @Jarod42, that was wrong from the beginning. – iheanyi Oct 25 '14 at 06:00
  • First two paragraphs are wrong - `A + 1` will point to `2`, not `1`. – Anton Savin Oct 25 '14 at 17:30
  • @AntonSavin, No. A is the address of the first element of the array. A + 1 is the address of the second element. Based on what I've written, the second element in both the first and second cases is the value 1. – iheanyi Oct 29 '14 at 21:28
  • @iheanyi The second element of `A` is `{2, 3}`. [You could just try it yourself](http://coliru.stacked-crooked.com/a/ca38e24a34c39701) – Anton Savin Oct 29 '14 at 21:32
  • @AntonSavin yup, you're right. With multi-dimensional arrays, a single operator will work on the first dimension, treating the second as 0. – iheanyi Oct 29 '14 at 21:50