
I heard from a friend that two-dimensional arrays in C are only supported syntactically.

He told me I should use float arr[M * N] instead of float arr[M][N], because C compilers such as gcc can't guarantee that the data is laid out contiguously in memory on every system/platform.

I want to use this as an argument in my master's thesis, but I don't have a reference for it.

So first question:

Is what he's saying correct?

Second question:

Do you know of a book or an article where this statement can be found?

Thanks + Regards

Christoph
unlimited101
  • It's nonsense. `float[M*N]` need not give you a physically contiguous chunk of memory either, and `float[M][N]` is logically as contiguous. – Daniel Fischer Jun 15 '13 at 16:06
  • The C Programming Language by Kernighan and Ritchie is the best book out there. Check it out. – priteshbaviskar Jun 15 '13 at 16:08
  • K&R is good, but getting more outdated by the year. It was last revised in 1988. – Carl Norum Jun 15 '13 at 16:09
  • @DanielFischer: No that's incorrect, `float[M*N]` and `float[M][N]` are _both_ required by the standard to be contiguous. – Jack Aidley Jun 15 '13 at 16:11
  • @JackAidley, the standard doesn't say anything about physical memory. I think that's the point Daniel is trying to get at. – Carl Norum Jun 15 '13 at 16:12
  • @JackAidley Logically. The Operating system may still give you several physical chunks. That may be a violation of the standard, but who's to control the memory manager of the OS? – Daniel Fischer Jun 15 '13 at 16:14
  • @DanielFischer: Ah, okay. I take your point. It's logical continuity that matters to the programmer though. – Jack Aidley Jun 15 '13 at 16:17
  • @DanielFischer If you are referring to virtual-physical mapping by an MMU, that's completely irrelevant in C since it's impossible to detect in C. Arrays and arrays of arrays are contiguous in memory, i.e. with increasing addresses without gaps, and sizeof(x[N][M]) == sizeof (x[N*M]). – Jens Jun 15 '13 at 16:18

4 Answers

  1. No, he's wrong.

  2. Look at the C standard. Some relevant bits (bold emphasis mine):

    6.2.5 Types ¶20

    An array type describes a contiguously allocated nonempty set of objects with a particular member object type, called the element type.

    6.7.6.2 Array declarators ¶3 (note 142)

    When several "array of" specifications are adjacent, a multidimensional array is declared.

    6.5.2.1 Array subscripting ¶3

    Successive subscript operators designate an element of a multidimensional array object. ... It follows from this that arrays are stored in row-major order (last subscript varies fastest).

    And perhaps most explicitly, the example in 6.5.2.1 Array subscripting ¶4:

    EXAMPLE Consider the array object defined by the declaration

    int x[3][5];

    Here x is a 3 × 5 array of ints; more precisely, x is an array of three element objects, each of which is an array of five ints. In the expression x[i], which is equivalent to (*((x)+(i))), x is first converted to a pointer to the initial array of five ints. Then i is adjusted according to the type of x, which conceptually entails multiplying i by the size of the object to which the pointer points, namely an array of five int objects. The results are added and indirection is applied to yield an array of five ints. When used in the expression x[i][j], that array is in turn converted to a pointer to the first of the ints, so x[i][j] yields an int.

Multidimensional arrays in C are just "arrays of arrays". They work fine and are 100% defined by the standard.
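The contiguity that the quoted paragraphs require can be checked with `sizeof` and byte offsets. This is only a minimal sketch: the names `ROWS`, `COLS`, `row_offset` and `rows_are_contiguous` are made up for illustration, and the offsets are measured through `char` pointers, which may address every byte of an object.

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 3, COLS = 5 };   /* illustrative sizes, matching int x[3][5] */

/* Byte distance from the start of x to the start of row i. */
static size_t row_offset(int (*x)[COLS], size_t i)
{
    assert(i < ROWS);
    return (size_t)((const char *)x[i] - (const char *)x);
}

/* Returns 1 if x is laid out as ROWS back-to-back rows of COLS ints. */
int rows_are_contiguous(void)
{
    int x[ROWS][COLS];

    /* The whole array is exactly ROWS rows, each exactly COLS ints. */
    if (sizeof x != ROWS * sizeof x[0])
        return 0;
    if (sizeof x[0] != COLS * sizeof(int))
        return 0;

    /* Row i starts i whole rows after the start of the array. */
    for (size_t i = 0; i < ROWS; ++i) {
        if (row_offset(x, i) != i * sizeof x[0])
            return 0;
    }
    return 1;
}
```

On a conforming implementation `rows_are_contiguous()` should return 1, since an array type has no room for padding between its elements.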

You may also find it helpful to read Section 6, Arrays and Pointers in the comp.lang.c FAQ.

Carl Norum
  • So the C standard says that compilers should store arrays continuously. But do all of them follow that rule? That might be what his friend was talking about. I don't know the answer, just being curious. – ldiqual Jun 20 '13 at 07:51
  • I have never heard of one that does otherwise. Besides, if it didn't, it wouldn't be a C compiler, right? – Carl Norum Jun 20 '13 at 14:14

The issue is a bit more subtle than the other answers make it sound:

While multi-dimensional arrays are (semantically, possibly not physically) contiguous, pointer arithmetic is only defined if you stay within the bounds of the array your pointer originally referenced (actually, you can go one element past the upper bound, but only if you don't dereference that pointer).

This means that the language semantics forbid walking through a multi-dimensional array from start to end with a single element pointer, and a bounds-checking implementation of the C language (possible in principle, but rarely seen in the wild for performance reasons) could raise a segfault, print a diagnostic or make demons fly from your nose whenever you cross a sub-array's boundary.

I'm not sure if compilers use this information for optimization purposes, but in principle, they could. For example, if you have

float *p = &arr[2][3];
float *q = &arr[5][9];

then p + x and q + y should never alias, regardless of the values of x and y.
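For comparison, here is a sketch of a traversal that stays within the rules described above: every element pointer is derived from its own row and never leaves it. The dimensions `ROWS` and `COLS` and the function name `sum_all` are assumptions chosen for illustration (6 × 10 is just large enough for `arr[5][9]` to exist).

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 6, COLS = 10 };   /* assumed dimensions, not from the question */

/* Sums all elements without ever moving a pointer across a row boundary. */
float sum_all(float arr[ROWS][COLS])
{
    assert(arr != NULL);

    float total = 0.0f;
    for (size_t i = 0; i < ROWS; ++i) {
        const float *row = arr[i];      /* points at the first float of row i */
        const float *end = row + COLS;  /* one past row i: formed, never dereferenced */
        for (const float *p = row; p != end; ++p)
            total += *p;
    }
    return total;
}
```

The problematic variant would be a single `float *p = &arr[0][0]` incremented `ROWS * COLS` times; it tends to work in practice, but it is the pattern this answer warns about.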

Christoph
  • Indeed; see the answers to [this question](http://stackoverflow.com/questions/6290956/one-dimensional-access-to-a-multidimensional-array-well-defined-c)... – Oliver Charlesworth Jun 15 '13 at 16:47

Section 6.2.5, paragraph 20, of the C standard requires that arrays be contiguously allocated. This applies as much to an array of arrays as it does to a one-dimensional array.

Your friend is simply wrong.

Jack Aidley

Built-in multi-dimensional arrays in C are implemented through index translation. This means that, for example, a 3D array T a[M][N][K] is implemented as a 1D array T a_impl[M * N * K], with the multi-dimensional access a[i][j][k] being implicitly translated into the one-dimensional access a_impl[((i * N) + j) * K + k]. The language specification does not explicitly describe this implementation; however, its requirements mandate it pretty much directly.
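The translation formula can be checked against what the compiler actually does. The following is a sketch under assumed sizes; `flat_index` and `translation_matches` are hypothetical names, and the comparison is done on byte offsets through `char` pointers so that no `int` pointer is moved across a sub-array boundary.

```c
#include <assert.h>
#include <stddef.h>

enum { M = 2, N = 3, K = 4 };   /* illustrative sizes */

/* The index translation from the text: ((i * N) + j) * K + k. */
size_t flat_index(size_t i, size_t j, size_t k)
{
    assert(i < M && j < N && k < K);
    return ((i * N) + j) * K + k;
}

/* Returns 1 if &a[i][j][k] lies flat_index(i, j, k) ints past the start of a. */
int translation_matches(void)
{
    int a[M][N][K];

    for (size_t i = 0; i < M; ++i)
        for (size_t j = 0; j < N; ++j)
            for (size_t k = 0; k < K; ++k) {
                size_t offset =
                    (size_t)((const char *)&a[i][j][k] - (const char *)a);
                if (offset != flat_index(i, j, k) * sizeof(int))
                    return 0;
            }
    return 1;
}
```

Note that the first size `M` never appears in the formula; it only bounds `i`.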

Taking this into account, it is not clear why your friend would tell you to use float arr[M * N] explicitly instead of relying on the implicit implementation of the same thing by the compiler.

The situation that might make you consider the float arr[M * N] approach is when both M and N are run-time values and your compiler does not support variable-length arrays (or you for some reason do not want to use them). In such cases the built-in support for multidimensional arrays is no longer applicable, since it relies on all sizes (except the first one) being compile-time constants. Maybe this is what your friend had in mind.
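A "manual" implementation for that case might look roughly like the sketch below. The `struct matrix` type and the `matrix_init`, `matrix_at` and `matrix_free` helpers are hypothetical names invented for this example, and the sketch omits an overflow check on `rows * cols`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper type: a rows x cols matrix stored in one flat block. */
struct matrix {
    size_t rows, cols;
    float *data;
};

/* Allocates zero-initialized storage; returns 1 on success, 0 on failure.
   Sketch only: rows * cols is not checked for overflow, and both sizes
   are expected to be non-zero. */
int matrix_init(struct matrix *m, size_t rows, size_t cols)
{
    m->rows = rows;
    m->cols = cols;
    m->data = calloc(rows * cols, sizeof *m->data);
    return m->data != NULL;
}

/* Manual row-major index translation: element (i, j) lives at i * cols + j. */
float *matrix_at(struct matrix *m, size_t i, size_t j)
{
    assert(i < m->rows && j < m->cols);
    return &m->data[i * m->cols + j];
}

void matrix_free(struct matrix *m)
{
    free(m->data);
    m->data = NULL;
}
```

Usage is then `*matrix_at(&m, i, j) = value;` in place of `arr[i][j] = value;`, with both dimensions supplied at run time.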

AnT stands with Russia
  • Even if your compiler doesn't support VLAs, you can still allocate a real multidimensional array at runtime. It's just some (gross) syntax gymnastics to use it. – Carl Norum Jun 15 '13 at 16:36
  • @Carl Norum: One can allocate a real multidimensional array iff the second, third and further sizes are compile-time constants. (As illustrated by the above formula, the first size `M` does not participate in index recalculation). This is why I said that one will have to consider "manual" implementations when **both** `M` **and** `N` are run-time values. – AnT stands with Russia Jun 15 '13 at 16:39
  • There is also the issue that you cannot pass multidimensional arrays around without hardcoding their dimensions in the signatures of the functions that accept them. And, of course, doing that would be evil. So, better listen to your friend and do your own pointer arithmetic. – cmaster - reinstate monica Jun 15 '13 at 16:58
  • @cmaster: no need to hard-code dimensions - C99 has variably-modified types – Christoph Jun 15 '13 at 18:12
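
As a sketch of what the last comment refers to: with C99 variably modified types, the dimensions can be ordinary parameters declared before the array parameter. The function name `scale` is made up for illustration, and this assumes a compiler that supports variable-length array types (they are optional from C11 onwards).

```c
#include <assert.h>
#include <stddef.h>

/* rows and cols are run-time values; a has type float (*)[cols] here. */
void scale(size_t rows, size_t cols, float a[rows][cols], float factor)
{
    assert(rows > 0 && cols > 0);   /* a variable-length array size must be positive */

    for (size_t i = 0; i < rows; ++i)
        for (size_t j = 0; j < cols; ++j)
            a[i][j] *= factor;      /* ordinary 2D indexing, no manual arithmetic */
}
```

A caller with `float m[2][3]` would write `scale(2, 3, m, 2.0f);`, and the same function accepts arrays of any other size.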