How are multi-dimensional arrays formatted in memory?

Question

In C, I know I can dynamically allocate a two-dimensional array on the heap using the following code:

int** someNumbers = malloc(arrayRows*sizeof(int*));

for (i = 0; i < arrayRows; i++) {
    someNumbers[i] = malloc(arrayColumns*sizeof(int));
}

Clearly, this actually creates a one-dimensional array of pointers to a bunch of separate one-dimensional arrays of integers, and "The System" can figure out what I mean when I ask for:

someNumbers[4][2];

But when I statically declare a 2D array, as in the following line...:

int someNumbers[ARRAY_ROWS][ARRAY_COLUMNS];

...does a similar structure get created on the stack, or is it of another form completely? (i.e. is it a 1D array of pointers? If not, what is it, and how do references to it get figured out?)

Also, when I said, "The System," what is actually responsible for figuring that out? The kernel? Or does the C compiler sort it out while compiling?

@toohonestforthissite Indeed. To expand on that: **Looping and calling `malloc()` does not result in an N-dimensional array.**. It results in arrays of pointers [to arrays of pointers[...]] to completely separate **one-dimensional** arrays. See [**Correctly allocating multi-dimensional arrays**](https://stackoverflow.com/questions/42094465/correctly-allocating-multi-dimensional-arrays) to see how to allocate **TRUE** N-dimensional array. — Andrew Henle, Jan 11 '20 at 14:16

score 173 · Accepted Answer · edited Feb 13 '16 at 22:55

173

A static two-dimensional array looks like an array of arrays - it's just laid out contiguously in memory. Arrays are not the same thing as pointers, but because you can often use them pretty much interchangeably it can get confusing sometimes. The compiler keeps track properly, though, which makes everything line up nicely. You do have to be careful with static 2D arrays like you mention, since if you try to pass one to a function taking an int ** parameter, bad things are going to happen. Here's a quick example:

int array1[3][2] = {{0, 1}, {2, 3}, {4, 5}};

In memory looks like this:

0 1 2 3 4 5

exactly the same as:

int array2[6] = { 0, 1, 2, 3, 4, 5 };

But if you try to pass array1 to this function:

void function1(int **a);

you'll get a warning (and the app will fail to access the array correctly):

warning: passing argument 1 of ‘function1’ from incompatible pointer type

Because a 2D array is not the same as int **. The automatic decaying of an array into a pointer only goes "one level deep" so to speak. You need to declare the function as:

void function2(int a[][2]);

or

void function2(int a[3][2]);

To make everything happy.

This same concept extends to n-dimensional arrays. Taking advantage of this kind of funny business in your application generally only makes it harder to understand, though. So be careful out there.

edited Feb 13 '16 at 22:55

Greg Kennedy

430
4
23

answered Apr 02 '10 at 04:49

Carl Norum

219,201
40
422
469

Thanks for the explanation. So "void function2(int a[][2]);" will accept both statically and dynamically declared 2D's? And I guess it's still good practice/essential to pass in the length of the array as well if the first dimension is left as []? – Chris Cooper Apr 02 '10 at 17:32
1

@Chris I don't think so - you'll have a hard time making C swizzle a stack- or globally-allocated array into a bunch of pointers. – Carl Norum Apr 03 '10 at 16:00
Certainly a[][] should work! In C every array is really just a pointer with an offset, the memory allocation is the only real difference. I wouldn't expect the warning for a** mentioned above to prevent you from compiling, though that would depend on the compiler and settings. – Jason K. Oct 10 '16 at 00:23
9

@JasonK. - no. Arrays are not pointers. Arrays "decay" into pointers in some contexts, but they are absolutely *not* the same. – Carl Norum Oct 10 '16 at 01:25
2

To be clear: Yes Chris "it's still good practice to pass in the length of the array" as a separate parameter, otherwise use std::array or std::vector (which is C++ not old C). I think we agree @CarlNorum both conceptually for new users and practically, to quote Anders Kaseorg on Quora: “The first step to learning C is understanding that pointers and arrays are the same thing. The second step is understanding that pointers and arrays are different.” – Jason K. Oct 15 '16 at 21:23
2

@JasonK. "The first step to learning C is understanding that pointers and arrays are the same thing." - This quote is so very wrong and missleading! It is indeed the most important step to understand they are **not** the same, but that arrays are **converted** to a pointer to the **first element** for most operators! `sizeof(int[100]) != sizeof(int *)` (unless you find a platform with `100 * sizeof(int)` bytes/`int`, but that is a different thing. – too honest for this site Nov 18 '16 at 18:40

score 91 · Answer 2 · edited Aug 15 '12 at 23:01

The answer is based on the idea that C doesn't really have 2D arrays - it has arrays-of-arrays. When you declare this:

int someNumbers[4][2];

You are asking for someNumbers to be an array of 4 elements, where each element of that array is of type int [2] (which is itself an array of 2 ints).

The other part of the puzzle is that arrays are always laid out contiguously in memory. If you ask for:

sometype_t array[4];

then that will always look like this:

| sometype_t | sometype_t | sometype_t | sometype_t |

(4 sometype_t objects laid out next to each other, with no spaces in between). So in your someNumbers array-of-arrays, it'll look like this:

| int [2]    | int [2]    | int [2]    | int [2]    |

And each int [2] element is itself an array, that looks like this:

| int        | int        |

So overall, you get this:

| int | int  | int | int  | int | int  | int | int  |

looking at the final layout make me think that int a[][] can be accessed as int *...right? — Narcisse Doudieu Siewe, May 22 '15 at 15:54
@user3238855: The types are not compatible, but if you get a pointer to the first `int` in the array of arrays (eg. by evaluating `a[0]` or `&a[0][0]`) then yes, you can offset that to sequentially access every `int`). — caf, May 22 '15 at 22:03

score 32 · Answer 3 · edited Feb 01 '13 at 13:21

32

unsigned char MultiArray[5][2]={{0,1},{2,3},{4,5},{6,7},{8,9}};

in memory is equal to:

unsigned char SingleArray[10]={0,1,2,3,4,5,6,7,8,9};

edited Feb 01 '13 at 13:21

ollo

24,797
14
106
155

answered Feb 01 '13 at 13:02

kanghai

329
3
3

score 5 · Answer 4 · answered Apr 02 '10 at 05:13

5

In answer to your also: Both, though the compiler is doing most of the heavy lifting.

In the case of statically allocated arrays, "The System" will be the compiler. It will reserve the memory like it would for any stack variable.

In the case of the malloc'd array, "The System" will be the implementer of malloc (the kernel usually). All the compiler will allocate is the base pointer.

The compiler is always going to handle the type as what they are declared to be except in the example Carl gave where it can figure out interchangeable usage. This is why if you pass in a [][] to a function it must assume that it is a statically allocated flat, where ** is assumed to be pointer to pointer.

answered Apr 02 '10 at 05:13

Jon L

273
1
7

@Jon L. I would not say that malloc is implemented by the kernel, but by the libc on top of kernel primitives (such as brk) – Manuel Selva Nov 02 '16 at 15:36
@ManuelSelva: Where and how `malloc` is implemented is not specified by the standard and left to the implementation, resp. environment. For freestanding environments it is optional like all parts of the standard library requiring linking functions (that's what the requirements actually result in, not literaly what the standard states). For some modern hosted environments, it indeed relies on kernel functions, either the complete stuff, or (e.g. Linux) as you wrote using both, stdlib and kernel-primitives. For non-virtual memory single-process systems, it can be stdlib only. – too honest for this site Nov 18 '16 at 18:46

sqr163 · Answer 5 · 2020-01-13T21:37:32.300

Suppose, we have a1 and a2 defined and initialized like below (c99):

int a1[2][2] = {{142,143}, {144,145}};
int **a2 = (int* []){ (int []){242,243}, (int []){244,245} };

a1 is a homogeneous 2D array with plain continuous layout in memory and expression (int*)a1 is evaluated to a pointer to its first element:

a1 --> 142 143 144 145

a2 is initialized from a heterogeneous 2D array and is a pointer to a value of type int*, i.e. dereference expression *a2 evaluates into a value of type int*, memory layout does not have to be continuous:

a2 --> p1 p2
       ...
p1 --> 242 243
       ...
p2 --> 244 245

Despite totally different memory layout and access semantics, C-language grammar for array-access expressions looks exactly the same for both homogeneous and heterogeneous 2D array:

expression a1[1][0] will fetch value 144 out of a1 array
expression a2[1][0] will fetch value 244 out of a2 array

Compiler knows that the access-expression for a1 operates on type int[2][2], when the access-expression for a2 operates on type int**. The generated assembly code will follow the homogeneous or heterogeneous access semantics.

The code usually crashes at run-time when array of type int[N][M] is type-casted and then accessed as type int**, for example:

((int**)a1)[1][0]   //crash on dereference of a value of type 'int'

score 1 · Answer 6 · answered Feb 09 '16 at 12:51

To access a particular 2D array consider the memory map for an array declaration as shown in code below:

    0  1
a[0]0  1
a[1]2  3

To access each element, its sufficient to just pass which array you are interested in as parameters to the function. Then use offset for column to access each element individually.

int a[2][2] ={{0,1},{2,3}};

void f1(int *ptr);

void f1(int *ptr)
{
    int a=0;
    int b=0;
    a=ptr[0];
    b=ptr[1];
    printf("%d\n",a);
    printf("%d\n",b);
}

int main()
{
   f1(a[0]);
   f1(a[1]);
    return 0;
}

How are multi-dimensional arrays formatted in memory?

6 Answers6

Linked

Related