5

I have initialized a 2D array like a 1D array:

int a[2][3] = {1,2,3,4,5} 

How are the values stored in this array?

Ardent Coder
  • 3,777
  • 9
  • 27
  • 53

3 Answers3

5

They are assigned as follows:

1 2 3
4 5 0

The zero is because you've allocated an array of size 6, but only specified 5 elements.

This is called "row-major order".

You may wish to formalize your code slightly. Your code is currently:

int a[2][3] = {1,2,3,4,5};

If you compile this with gcc main.c -Wall -pedantic --std=c99, you'll get a few warnings:

temp.c:2:17: warning: missing braces around initializer [-Wmissing-braces]

Resolve this using

int a[2][3] = {{1,2,3,4,5}};

This will give you a new warning:

temp.c:2:25: warning: excess elements in array initializer

Resolve this using:

int a[2][3] = {{1,2,3},{4,5,0}};

This explicitly represents the data as having two rows of three elements each.

Some thoughts on memory layout

int a[2][3] will produce an "array of arrays". This is similar to, but contradistinct from, an "array of pointers to arrays". Both have similar access syntax (e.g. a[1][2]). But only for the "array of arrays" can you reliably access elements using a+y*WIDTH+x.

Some code might clarify:

#include <stdlib.h>
#include <stdio.h>

void PrintArray1D(int* a){
  for(int i=0;i<6;i++)
    printf("%d ",a[i]);
  printf("\n");
}

int main(){
  //Construct a two dimensional array
  int a[2][3] = {{1,2,3},{4,5,6}};

  //Construct an array of arrays
  int* b[2];
  b[0] = calloc(3,sizeof(int));
  b[1] = calloc(3,sizeof(int));

  //Initialize the array of arrays
  for(int y=0;y<2;y++)
  for(int x=0;x<3;x++)
    b[y][x] = a[y][x];

  PrintArray1D(a[0]);
  PrintArray1D(b[0]);
}

When you run this, you get:

1 2 3 4 5 6 
1 2 3 0 0 0 

Printing b gives zeros (on my machine) because it runs into uninitialized memory. The upshot is that using contiguous memory allows you to do handy things, let set all of the values without needing a double loop.

Richard
  • 56,349
  • 34
  • 180
  • 251
  • Just out of curiosity, what warning would be triggered by `int a[2][3] = {{1,2},{3,4},{5}}`? – Mad Physicist Jun 16 '17 at 19:43
  • 1
    "This is not necessarily so." - Please elaborate! "An array which contains to int[3] arrays may be non-contiguous in memory." - That's wrong. `int a[2][3]` **is** an array of arrays and guaranteed to be continous in memory! It cannot be anything else. If you refer to something like `int *a[3]`: that is a completely different datatype and not an array of arrays! – too honest for this site Jun 16 '17 at 21:02
  • @Olaf: Exactly. `int a[2][3]` is contiguous in memory and all accesses are valid. However, the `a[2][3]` access pattern on an array of arrays is not guaranteed to be valid in situations where you allocate the rows separately. – Richard Jun 16 '17 at 22:10
  • @MadPhysicist: I get `warning: excess elements in array initializer` pointing that the bracket preceding the 5. It indicates too many rows, rather than too few elements. – Richard Jun 16 '17 at 22:11
  • So you basically say a different data type is different? Not really a surprise! – too honest for this site Jun 16 '17 at 22:20
  • @Olaf: Right? As my answer says, this text was in response to someone else's answer. – Richard Jun 16 '17 at 22:27
  • @Richard: 1) That answer has been deleted now. 2) You alrady commented at the answer. Don't comment other answers in your answer. 3) That answer does not even mention an array of pointers! Maybe you have something in mind, but as-is, part of your answer is missleading. – too honest for this site Jun 16 '17 at 22:33
  • @Olaf: I've edited my answer. You're welcome to edit if you think you can better clarify the intended point. – Richard Jun 16 '17 at 22:41
  • You did not change the wrong part! You cannot allocate the rows seperately in an array of arrays! I think my original comment was very clear. You explain apples by demonstrating oranges! – too honest for this site Jun 16 '17 at 22:49
  • @Olaf, if the original comment was clear, we would not now be having this discussion. I am sorry I did not understand it as you meant. It occurs to me that perhaps you meant to distinguish between "array of arrays" and "array of pointers to arrays". Is this so? – Richard Jun 16 '17 at 22:51
  • What's unclear about "If you refer to something like int *a[3]: that is a completely different datatype and not an array of arrays!" ? – too honest for this site Jun 16 '17 at 22:55
  • Well, as I said, that is what I'd been trying to express. And now we've gone in a circle. Though you didn't respond to my previous question, I think I understand you now. I think your original comment would have been more helpful to me had you said that I'd conflated an "array of arrays" with an "array of pointers to arrays". As it is, it took us several comments to reach that point. I'd recommend that, in future, you focus on more lucid explanations of what you think is correct, rather than expressing outrage (you have a wealth of exclamation points, rhetorical questions, and sarcasm). – Richard Jun 16 '17 at 23:27
3

In C, 1D arrays are stored in a single linear buffer in memory in what is called "row major" order. Row major means that the last index varies fastest as you go from element to element. Column major would mean that the first index varies fastest, as it does in MATLAB, for example.

The array you declared is only 2D in the sense that the compiler helps you out by computing the linear address of the elements for you. The address of an element in a 1D array is computed linear[x] = linear + x. Similarly, for your 2D array, a[y][x] = a + 3 * y + x. In general, a[y][x] = a + num_cols * y + x.

You can initialize the array as a single vector of elements, which will first fill the first row, then the second, and so on. Since you have two rows of three elements each, the first row becomes 1, 2, 3 and the second row becomes 4, 5, 0.

Indexing past the end of a row is perfectly valid, as far as the compiler is concerned at least. In the example you give, a[0][3] is accessing the fourth element of the first row in an array that is three elements wide. With wrap-around, you can see that this is just the first element of the second row, which is more explicitly stated as a[1][0].

Because of the lax index checking, you can completely omit the first index in any array as long as you provide an initializer. The formula to compute the linear address does not depend on the first index (because it is row major), and the total number of elements is specified by the initializer itself. A 1D example is int linear[] = {1, 2, 3};.

Keep in mind that the name of the array also refers to the pointer to its first element. These are two different things that can be accessed by the same name.

Mad Physicist
  • 107,652
  • 25
  • 181
  • 264
  • A general statement "C does not do any bounds checking on pointer arithmetic and array indexing" is actually not true, as even pointer arithmetics beyond array bounds are undefined behaviour; It's even a discussion if violating the bounds of a sub-array is undefined behaviour. – Stephan Lechner Jun 16 '17 at 20:20
  • @StephanLechner. Fair enough, although it is true that the compiler specifically allows you to trigger this sort of undefined behavior. I removed the offending lines since, as you pointed out, they were not relevant to the discussion. – Mad Physicist Jun 16 '17 at 20:23
  • Allowing this initialiser is actually a legacy and not related to the layout of an array in memory. Moern compilers will warn about such code. – too honest for this site Jun 16 '17 at 20:58
  • 1
    Technically, the reason a flat list of initializers initializes an array in row-major order is not because of how arrays are stored in memory but because of the C rules for initialization. In the 2011 standard, section 6.7.9, clauses 17 to 20 describe a process of using the list to initialize the declared object, including its subobjects. Hypothetically, those clauses could be written to specify a different order, and then the initialization would occur in that order even though arrays are stored in row-major order. – Eric Postpischil Jun 16 '17 at 22:50
2

From the definition of how an access to a 2D-array like a[1][2] is interpreted, "It follows from this that arrays are stored in row-major order" (cf, for example, this online C standard comitee draft / array subscripting). This means that for an array int a[ROWS][COLUMNS] for an access a[r][c] the offset in terms of int values is calculated like (r*COLUMNS + c).

So for an array int a[2][3], an access a[0][1] has offset 0*3 + 1 = 1, and an access a[1][0] has an offset 1*3 + 0 = 3. That said, a[0][3] might lead to offset 3, while a[1][0] for sure leads to 3. I wrote "might" because I think that accessing an array int a[2][3] with a[0][3] is undefined behaviour, as the range of the last subscript is 0..2. So according to 6.5.6 (8), expression a[0][3] is addressing the sub-array a[0] out of its bounds as argued, for example, here.

Now to the thing of how int a[2][3] = {1,2,3,4,5} is interpreted. This statement is initialization as defined in section 6.7.9 of this online C standard comitee draft, and paragraphs (20) to (26) describe the things needed here:

(20) If the aggregate or union contains elements or members that are aggregates or unions, these rules apply recursively to the subaggregates or contained unions. If the initializer of a subaggregate or contained union begins with a left brace, the initializers enclosed by that brace and its matching right brace initialize the elements or members of the subaggregate or the contained union. Otherwise, only enough initializers from the list are taken to account for the elements or members of the subaggregate or the first member of the contained union; any remaining initializers are left to initialize the next element or member of the aggregate of which the current subaggregate or contained union is a part.

(21) If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.

26 EXAMPLE

(3) The declaration

      int y[4][3] = {
            { 1, 3, 5 },
            { 2, 4, 6 },
            { 3, 5, 7 },
      };

is a definition with a fully bracketed initialization: 1, 3, and 5 initialize the first row of y (the array object y[0]), namely y[0][0], y[0][1], and y[0][2]. Likewise the next two lines initialize y[1] and y[2]. The initializer ends early, so y[3] is initialized with zeros. Precisely the same effect could have been achieved by

      int y[4][3] = {
            1, 3, 5, 2, 4, 6, 3, 5, 7
      };

The initializer for y[0] does not begin with a left brace, so three items from the list are used. Likewise the next three are taken successively for y[1] and y[2].

Stephan Lechner
  • 34,891
  • 4
  • 35
  • 58
  • 1
    cppreference is not an authoritative resource. Why not reference the standard, resp. the final draft? – too honest for this site Jun 16 '17 at 20:59
  • @Olaf: right, cppreference is not normative; But I was just looking for examples, and I found cppreference more brief here. Anyway, do you have a final draft that is online available? I always just find the one I used for citation... – Stephan Lechner Jun 16 '17 at 21:15
  • Every body is a critic nowadays - [§ 6.7.9 Initialization (p21)](http://port70.net/~nsz/c/c11/n1570.html#6.7.9p21) is key, good citation. – David C. Rankin Mar 25 '20 at 06:04