
I am wondering if the C++ standard guarantees that multidimensional arrays (not dynamically allocated) are flattened into a 1D array of exactly the same space. For example, if I have

char x[100];
char y[10][10];

Would these both be equivalent? I'm aware that most compilers would flatten y, but is this actually guaranteed to happen? Reading section 11.3.4 Arrays of the C++ Standard, I cannot actually find anywhere that guarantees this.

The C++ standard guarantees that y[i] follows immediately after y[i-1]. Since y[i-1] is 10 characters long, y[i] should, logically speaking, be located 10 characters later in memory; however, could a compiler pad y[i-1] with extra characters to keep y[i] aligned?

ChrisMM
  • What, exactly, are you trying to accomplish with this? Accessing `y[0][10]` will be UB. – 1201ProgramAlarm Nov 04 '19 at 17:37
  • Sounds like what you are really asking is if you can iterate a 2d array as if it is a 1d one. The answer to that is legally no, but most/all implementations allow it since it is the only reasonable behavior. Very related/dupe: https://stackoverflow.com/questions/7269099/may-i-treat-a-2d-array-as-a-contiguous-1d-array – NathanOliver Nov 04 '19 at 17:39
  • @1201ProgramAlarm My question isn't specifically about using the 2D array as a 1D, but how it is stored in memory. As an example, if I want to make a 10x10 board, then is it better to use a 1D array or a 2D array, in terms of space requirements. – ChrisMM Nov 04 '19 at 17:42
  • @NathanOliver-ReinstateMonica That post seems to have different answers. The accepted answer seems to imply it's up to interpretation, whereas the most voted answer says it's _indirectly guaranteed_, and it is not UB, but I cannot find where this is "indirectly guaranteed" in the standard. – ChrisMM Nov 04 '19 at 17:48
  • @ChrisMM The standard guarantees the arrays are contiguous and there is no padding. That's why it will work. The answer by xskxzr on that Q&A is the correct one though as you can't legally iterate from one row of the array to another one. That is dictated by the pointer addition rules. – NathanOliver Nov 04 '19 at 17:51
  • @NathanOliver-ReinstateMonica Where does it say there is no padding, especially for multiple dimensions? From what I can tell, it says that `y[i-1]` must precede `y[i]`, but this does not (to me) mean that there are not dummy bytes between `y[i-1]` and `y[i]`. From what I know, a compiler would pad `char z[5]` with 3 bytes to maintain 64-bit alignment, if the next value is a 64-bit number for example, or maybe 1 byte for a 32-bit value. Why could it not do this for multi-dimensional arrays, since they are arrays of arrays? Sorry, I am just trying to understand where there is a guarantee. – ChrisMM Nov 04 '19 at 18:03
  • @ChrisMM See: https://timsong-cpp.github.io/cppwp/dcl.array#6 – NathanOliver Nov 04 '19 at 18:10
  • @NathanOliver-ReinstateMonica Not trying to be difficult, but I don't see that as guaranteeing that there's no padding added to an array. That specifies that there's no padding between `y[i-1]` and `y[i]`, but if `y[i-1]` takes up `sizeof(char[10])+2` bytes, then there's still no additional padding between array elements, which would comply with the standard. If I have `char x[5]; int y[5]` then the compiler is free to add padding after the first array, so why can it not for the multi-dimensional array? – ChrisMM Nov 04 '19 at 18:19
  • @ChrisMM I've added an answer to hopefully clear this all up. – NathanOliver Nov 04 '19 at 18:31

1 Answer


What you are looking for is found in [dcl.array]/6

An object of type “array of N U” contains a contiguously allocated non-empty set of N subobjects of type U, known as the elements of the array, and numbered 0 to N-1.

What this states is that if you have an array like int arr[10], then you have 10 ints that are contiguous in memory. This definition works recursively, though, so if you have

int arr[5][10];

then what you have is an array of 5 int[10] arrays. If we apply the definition from above, then we know that the 5 int[10] arrays are contiguous and that the ints within each int[10] are themselves contiguous, so all 50 ints are contiguous. So yes, a 2D array looks just like a 1D array in memory, since that is really what it is.
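As a rough compile-time check of this layout claim, the sketch below relies on the rule in [expr.sizeof] that the size of an array of n elements is n times the size of an element. The arrays are the ones from the question and this answer; `element_count` is a hypothetical helper added only for illustration.

```cpp
#include <cstddef>

int arr[5][10];   // the array from the answer
char x[100];      // the two arrays from the question
char y[10][10];

// [expr.sizeof]: the size of an array of n elements is n times the element
// size, so neither a row nor the whole array can carry trailing padding.
static_assert(sizeof(arr[0]) == 10 * sizeof(int), "one row is exactly 10 ints");
static_assert(sizeof(arr) == 5 * sizeof(arr[0]), "5 rows, packed back to back");
static_assert(sizeof(arr) == 50 * sizeof(int), "50 ints in total");
static_assert(sizeof(y) == sizeof(x), "char[10][10] is as large as char[100]");

// Hypothetical helper: number of int elements stored in arr.
constexpr std::size_t element_count() {
    return sizeof(arr) / sizeof(arr[0][0]);
}
```

If any of these assertions failed, the translation unit would not compile, so the checks cost nothing at run time.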

This does not mean you can get a pointer to arr[0][0] and iterate to arr[4][9] with it. Per [expr.add]/4

When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P.

  • If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value.

  • Otherwise, if P points to an array element i of an array object x with n elements ([dcl.array]), the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) array element i+j of x if 0≤i+j≤n and the expression P - J points to the (possibly-hypothetical) array element i−j of x if 0≤i−j≤n.

  • Otherwise, the behavior is undefined.

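What this states is that if you have a pointer to element i of an array with n elements, then you can only add or subtract offsets that keep the resulting index within [0, n], where index n is the one-past-the-end position. So if you did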

int* it = &arr[0][0];

then it points to the first element of the first array, which means you can legally only increment it as far as it + 10, since that is the one-past-the-end position of the first array. Going into the second array is UB even though the two arrays are contiguous.
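One way to stay within the pointer-arithmetic rules is to iterate row by row, so that each pointer only ever moves inside its own int[10]. The following is a minimal sketch of that approach; `fill_sequential` and `sum_all` are hypothetical helper names, not anything from the standard.

```cpp
int arr[5][10];

// Hypothetical helper: stores r * 10 + c in arr[r][c], i.e. the values 0..49.
void fill_sequential() {
    for (int r = 0; r < 5; ++r)
        for (int c = 0; c < 10; ++c)
            arr[r][c] = r * 10 + c;
}

// Hypothetical helper: sums all 50 elements without moving a pointer
// from one row into the next.
int sum_all() {
    int total = 0;
    for (int (&row)[10] : arr) {
        // row + 10 is the one-past-the-end position of this row: it may be
        // formed and compared against, but it is never dereferenced.
        for (int* it = row; it != row + 10; ++it)
            total += *it;
    }
    return total;
}
```

After calling fill_sequential(), sum_all() returns 1225, the sum of 0 through 49. The nested loop visits the elements in the same memory order a flat loop would, but every pointer stays inside a single row.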

NathanOliver
  • Sorry, I still have the one question here: does `int[10]` guarantee that it is `sizeof(int) * 10` with absolutely no padding on the end? Are padding rules only for distinct variables? – ChrisMM Nov 04 '19 at 18:38
  • @ChrisMM Yes. Arrays are not allowed to have padding. An array is defined as something that *contains a contiguously allocated non-empty set of N subobjects* – NathanOliver Nov 04 '19 at 18:42
  • @ChrisMM If you had `std::array` then there could be padding since `std::array` is a class type and those are allowed to have padding. – NathanOliver Nov 04 '19 at 18:43
  • Okay, I think I was confused because `char x[5]; int a;` can have padding between the two variables, and to me that padding belongs to `x`, and therefore takes more room than `sizeof( char ) * 5`. If that makes sense :) – ChrisMM Nov 04 '19 at 18:47
  • @ChrisMM That padding belongs to neither of the objects. Basically that padding belongs solely to the implementation. – NathanOliver Nov 04 '19 at 18:51
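The distinction drawn in these last comments can be made observable with a small sketch. It assumes the two declarations `char x[5]; int a;` are placed in a struct (hypothetically named `Pair` here) so that `offsetof` can report where `a` starts. The exact amount of padding is implementation-defined, so the code only checks what is guaranteed.

```cpp
#include <cstddef>

// Hypothetical struct mirroring `char x[5]; int a;` from the comments.
struct Pair {
    char x[5];
    int a;
};

// The array itself is exactly 5 bytes: padding is never part of the array.
static_assert(sizeof(Pair::x) == 5 * sizeof(char), "x has no padding of its own");

// a starts at or after the end of x, at an offset suitably aligned for int.
static_assert(offsetof(Pair, a) >= sizeof(Pair::x), "a comes after x");
static_assert(offsetof(Pair, a) % alignof(int) == 0, "a is aligned for int");

// Hypothetical helper: bytes between the end of x and the start of a.
// These bytes belong to Pair's layout, not to x and not to a.
constexpr std::size_t padding_after_x() {
    return offsetof(Pair, a) - sizeof(Pair::x);
}
```

On an implementation where int is 4 bytes with 4-byte alignment, padding_after_x() would typically be 3, yet sizeof(Pair::x) is still 5.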