25

I have seen dozens of questions about “what’s wrong with my code” regarding multidimensional arrays in C. For some reason people can’t seem to wrap their head around what is happening here, so I decided to answer this question as a reference to others:

How do I correctly set up, access, and free a multidimensional array in C?

If others have helpful advice, please feel free to post along!

Mike
  • 47,263
  • 29
  • 113
  • 177

5 Answers5

30

In C since C99, even dynamic multidimensional arrays can be easily allocated in one go with malloc and freed with free:

double (*A)[n] = malloc(sizeof(double[n][n]));

for (size_t i = 0; i < n; ++i)
  for (size_t j = 0; j < n; ++j)
      A[i][j] = someinvolvedfunction(i, j);

free(A);
Jens Gustedt
  • 76,821
  • 6
  • 102
  • 177
  • This is the preferred way, avoid pointer-to-pointer syntax. I'm not sure, but I believe this worked in C90 too however? Surely array pointers were around before C99? At least "mangled" arrays worked, ie `double* A = malloc(x*y*sizeof(double));`. – Lundin Sep 17 '12 at 15:54
  • 1
    @Lundin, no unfortunately the declaration part `double (*A)[n]` only worked if `n` was a compile time constant, basically a macro or `enum` constant. – Jens Gustedt Sep 17 '12 at 15:56
  • 1
    Aha, well I guess it doesn't make much sense to allocate dynamically with the size known at compile time :) Although, is the 'n' mandatory? Couldn't you write `double (*A)[] =` ? – Lundin Sep 17 '12 at 16:02
  • 2
    @Lundin: sometimes it makes sense to allocate dynamically with the size known at compile time, because a multi-dimensional array can pretty easily blow the stack. – Steve Jessop Sep 17 '12 at 16:05
  • @Lundin, the `[]` notation only works if the size can be deduced from the initializer expression. This would never be the case for `malloc`ed pointers to arrays. – Jens Gustedt Sep 17 '12 at 16:07
  • Great answer! I actually was not aware of this as an option. – Mike Sep 17 '12 at 19:34
  • 1
    @JensGustedt Can you return A from a function, and if so what is the return type? – Scooter Sep 21 '12 at 16:43
  • @Scooter, no pointer to VLA as such can unfortunately not be returned from function. You could have return `void*`, but the size and type information would be lost. In the syntactically weird corners of C you can return pointers to arrays of fixed size `double (*foo(void)[34][8]` or undetermined size `double (*foo(void)[]`. I think syntactically it would be possible to exetend also to VLA, perhaps I will put that on my list: http://p99.gforge.inria.fr/defects-and-improvements – Jens Gustedt Sep 21 '12 at 17:57
  • Make dynamic everything that you could want to reshape, and static everything you know that it wouldn't be reshaped.This case is preferred only if you don't want to reshape both dimensions. – Ramy Al Zuhouri Nov 05 '12 at 17:30
17

There are at least four different ways to create or simulate a multi-dimensional array in C89.

One is "allocate each row separately", described by Mike in his answer. It is not a multidimensional array, it merely imitates one (in particular it mimics the syntax for accessing an element). It can be useful in the case where each row has different size, so you aren't representing a matrix but rather something with a "ragged edge".

One is "allocate a multidimensional array". It looks likes this:

int (*rows)[NUM_ROWS][NUM_COLS] = malloc(sizeof *rows);
...
free(rows);

Then the syntax to access element [i,j] is (*rows)[i][j]. In C89, both NUM_COLS and NUM_ROWS must be known at compile-time. This is a true 2-D array, and rows is a pointer to it.

One is, "allocate an array of rows". It looks like this:

int (*rows)[NUM_COLS] = malloc(sizeof(*rows) * NUM_ROWS);
...
free(rows);

Then the syntax to access element [i,j] is rows[i][j]. In C89, NUM_COLS must be known at compile-time. This is a true 2-D array.

One is, "allocate a 1-d array and pretend". It looks like this:

int *matrix = malloc(sizeof(int) * NUM_COLS * NUM_ROWS);
...
free(matrix);

Then the syntax to access element [i,j] is matrix[NUM_COLS * i + j]. This (of course) is not a true 2-D array. In practice it has the same layout as one.

Steve Jessop
  • 273,490
  • 39
  • 460
  • 699
  • "allocate an array of rows", isn't this rather: allocate an array of arrays, then assign an array pointer to point at the first object/array? I always use this form myself, though perhaps the "2D" pointer is more stylistically correct? – Lundin Sep 17 '12 at 16:00
  • 1
    @Lundin: it's both. In all forms (except arguably the flattened array), each row is an array, so an array of rows is an array of arrays. But since a multidimensional array *is* an array of arrays anyway (by definition in the standard), my titles technically don't distinguish between them. To me, the difference in emphasis is clear, maybe not to others. – Steve Jessop Sep 17 '12 at 16:02
  • I think the first example is most readable and appealing, so perhaps this is the recommended form? Perhaps you could edit the post and give a recommended one, for present and future readers? At least it would be a shame if people rushed off to use the third, "mangled" syntax :) – Lundin Sep 17 '12 at 16:06
  • @Lundin: I certainly don't want to claim that one "is recommended", because I don't like appealing to the spurious authority of the passive voice ;-) – Steve Jessop Sep 17 '12 at 16:08
  • 1
    After given this some thought, I would definitely say that the first version is to prefer, because it would enable the chance for a compiler or static analysis tool to enforce "stronger typing", by detecting and warning against incorrect, implicit type conversions. The 2nd and 3rd forms could accidentally get mixed up with plain 1D arrays or plain pointers without a chance for any tools to detect possible bugs. – Lundin Sep 17 '12 at 16:37
  • 1
    No disrespect to your analysis, which I think is probably correct, but if I prefer something I just say I prefer it, I try to remember not to say it "is preferred". My concerns might not be the same as someone else's, and in particular in C89 the need for bounds known at compile-time is quite limiting. The syntax for the first option is not all that inviting but certainly it allows for static bounds-checking by the compiler in both dimensions rather than just one. – Steve Jessop Sep 17 '12 at 16:44
  • Can you break this into two lines int (*rows)[NUM_COLS] = malloc(sizeof(*rows) * NUM_ROWS); Is it rows = malloc(sizeof(*rows) * NUM_ROWS) or *rows = malloc(..)? – Sandeep Oct 14 '15 at 03:23
  • 1
    @mk..: the first one. – Steve Jessop Oct 14 '15 at 07:31
  • Thank you for the reply. I would also like to ask, is there any difference when we allocate it in the same line and when we allocate the memory separately. I am not able to correlate how the rows and columns would be representing a contiguous block when they are allocated at different times. – Sandeep Oct 15 '15 at 02:07
8

Statically speaking, this is easy to understand:

int mtx[3][2] = {{1, 2},
                 {2, 3},
                 {3, 4}};

Nothing complicated here. 3 rows, 2 columns; data in column one: 1, 2, 3; data in column two: 2, 3, 4. We can access the elements via the same construct:

for(i = 0; i<3; i++){
    for(j = 0; j<2; j++)
        printf("%d ", mtx[i][j]);
    printf("\n");
}
//output
//1 2
//2 3
//3 4

Now let’s look at this in terms of Pointers:

The brackets are a very nice construct to help simplify things, but it doesn’t help when we need to work in a dynamic environment, so we need to think of this in terms of pointers. If we want to store a “row” of integers, we need an array:

int row[2] = {1,2};

And you know what? We can access this just like a pointer.

printf("%d, %d\n",*row,*(row+1));   //prints 1, 2
printf("%d, %d\n",row[0],row[1]);   //prints 1, 2

Now if we don’t know the number of values in a row we can make this array a dynamic length if we have a pointer to int, and we give it some memory:

int *row = malloc(X * sizeof(int));  //allow for X number of ints
*row = 1;        //row[0] = 1
*(row+1) = 2; //row[1] = 2
…
*(row+(X-1)) = Y; // row[x-1] = Some value y

So now we have a dynamic 1 dimensional array; a single row. But we want lots of rows, not just one, and we don’t know how many. That means we need another dynamic 1 dimensional array, each element of that array will be a pointer which points to a row.

//we want enough memory to point to X number of rows
//each value stored there is a pointer to an integer
int ** matrix = malloc(X * sizeof(int *));

//conceptually:
(ptr to ptr to int)     (pointer to int)
   **matrix ------------> *row1 --------> [1][2]
                          *row2 --------> [2][3]
                          *row3 --------> [3][4]

Now all that’s left to do is to write the code which will perform these dynamic allocations:

int i, j, value = 0;

//allocate memory for the pointers to rows
int ** matrix = malloc(Rows * sizeof(int*));

//each row needs a dynamic number of elements
for(i=0; i<Rows; i++){
    // so we need memory for the number of items in each row… 
    // we could call this number of columns as well
    *(matrix + i) = malloc(X * sizeof(int));

     //While we’re in here, if we have the items we can populate the matrix
    for(j=0; j<X; j++)
        *(*(matrix+i)+j) = value; // if you deference (matrix + i) you get the row
                                  // if you add the column and deference again, you
                                  // get the actual item to store (not a pointer!)
}

One of the most important things to do now is to make sure we free the memory when we’re done. Each level of malloc() should have the same number of free() calls, and the calls should be in a FILO order (reverse of the malloc calls):

for(i=0; i<Rows; i++) 
    free(*(matrix + i));
free(matrix);

//set to NULL to clean up, matrix points to allocated memory now so let’s not use it!
matrix = NULL; 
Mike
  • 47,263
  • 29
  • 113
  • 177
  • 2
    Good answer, but please don't use the pointer-to-pointer syntax, it creates segmented multi-dim. arrays that are not compatible with statically allocated arrays, nor with C standard library functions like memcpy, memset, bsearch, qsort etc. See Jens' answer for the preferred way to allocate dynamic multi-dim. arrays. – Lundin Sep 17 '12 at 15:51
  • @Lundin - A great point, I chose to use the pointer-to-pointer syntax since that's how I was taught it back in the day and I think it's still being taught that way (based on the questions I've seen on SO) – Mike Sep 17 '12 at 16:03
  • 1
    It is not “syntax”. Syntax is rules about language or, colloquially, a particular sample of language. Issues of syntax are issues of expression and communication. The problem with the pointer-to-pointer method is not merely the language it uses but the wasteful actions it causes in the program: More memory is used than necessary (for the unneeded pointers and for the extra accounting when each row is allocated separately), more time is used than necessary (loading a pointer each time a row is accessed and extra allocation calls), and the code is more complex than necessary. – Eric Postpischil Sep 17 '12 at 16:15
  • @EricPostpischil It is syntax, because the type used is `int**` rather than `int (*)[]`. – Lundin Sep 17 '12 at 16:17
  • 3
    @Lundin: That is like saying the difference between Paris and a thermonuclear bomb is spelling, because one is spelled “Paris” and the other is spelled “thermonuclear bomb”. In fact, it is not the syntax that is the core difference or the difference with the greatest effect. The syntax is only a means of communication; it is the thing being communicated that is the real problem. Another way of seeing this is to translate it to another language: Suppose the syntax were swapped but the underlying behavior remained the same. Would that be better? No, the double-pointer problem would remain. – Eric Postpischil Sep 17 '12 at 16:31
1

If you want to use a typedef'd array, it is even simpler.

Say you have in your code typedef int LabeledAdjMatrix[SIZE][SIZE];

You can then use:

LabeledAdjMatrix *pMatrix = malloc(sizeof(LabeledAdjMatrix));

Then you can write:

for (i=0; i<SIZE; i++) {
    for (j=0; j<SIZE; j++) (*parr)[i][j] = k++; /* or parr[0][i][j]... */
}

Because pArr is a pointer to you matrix and * has lower priority than [];

That is why a mode common idiom is to typedef the row:

typedef int LabeledAdjRow[SIZE];

Then you can write:

LabeledAdjRow *pMatrix = malloc(sizeof(LabeledAdjRow) * SIZE);
for (i=0; i<SIZE; i++) {
    for (j=0; j<SIZE; j++) parr[i][j] = k++;
}

And you can memcpy all that directly:

LabeledAdjRow *pOther = malloc(sizeof(LabeledAdjRow) * SIZE);
memcpy(pOther, pMatrix, sizeof(LabeledAdjRow) * SIZE);
Serge Ballesta
  • 143,923
  • 11
  • 122
  • 252
  • I know it is a poor answer for the current question, but it is directly targeted at this other [question](http://stackoverflow.com/q/39951988/3545273) that was closed as a duplicate.... – Serge Ballesta Oct 10 '16 at 06:46
1

Going off of Jen's answer, if you want to allocate space for a 3D array, in the same manner, you can do

int(*A)[n][n] = malloc(sizeof(int[num_of_2D_arrays][n][n]));

for (size_t p = 0; p < num_of_2D_arrays; p++)
  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < n; j++)
      A[p][i][j] = p;

for (size_t p = 0; p < num_of_2D_arrays; p++)
    printf("Outter set %lu\n", p);
    for (size_t i = 0; i < n; i++)
      for (size_t j = 0; j < n; j++)
        printf(" %d", A[p][i][j]);
      printf("\n");

free(A);
Shawn T
  • 83
  • 1
  • 5