4

I am inspecting the behavior of two 2D arrays in C++, one allocated from the stack, and one allocated from the heap.

I create two, 2D arrays of the same shape, and populate those arrays with some data. Then I attempt to read the arrays in two different methods, the first being with the simple array index format "Arr[ROW][COLUMN]". Then I read the arrays using a pointer dereference, and I get two different results for the heap allocated array, but identical results for the stack allocated array. I am trying to understand why the results differ. I would appreciate any clarification that someone can provide. Thanks in advance.

The code I am running is below:

#include <iostream>

using namespace std;

int main(){

    int rows = 6;
    int columns = 3;

    // allocate from the stack.
    double q[rows][columns];

    // allocate from the heap.
    double ** a;
    a = new double*[rows];

    for(int i = 0; i < rows; ++i){
        a[i] = new double[columns];
    }

    // populate the arrays.
    for(int i = 0; i < rows; ++i){
        for(int j = 0; j < columns; ++j){
            a[i][j] = columns*i+j;
            q[i][j] = columns*i+j;
        }
    }

    cout << "*****************" << endl;
    cout << "Array indexing method." << endl;
    cout << "*****************" << endl;

    // print the heap allocated array using array indexing.
    for(int i = 0; i < rows; ++i){
        for(int j = 0; j < columns; ++j){
            cout << a[i][j] << '\t';
        }
        cout << endl;
    }

    cout << "*****************" << endl;

    // print the stack allocated array using array indexing.
    for(int i = 0; i < rows; ++i){
        for(int j = 0; j < columns; ++j){
            cout << q[i][j] << '\t';
        }
        cout << endl;
    }

    cout << "*****************" << endl;
    cout << "Pointer dereferencing method." << endl;
    cout << "*****************" << endl;

    // print the heap allocated array.
    for(int i = 0; i < rows; ++i){
        for(int j = 0; j < columns; ++j){
            cout << *(&a[0][0] + columns*i + j) << '\t';
        }
        cout << endl;
    }

    cout << "*****************" << endl;

    // print the stack allocated array.
    for(int i = 0; i < rows; ++i){
        for(int j = 0; j < columns; ++j){
            cout << *(&q[0][0] + columns*i + j) << '\t';
        }
        cout << endl;
    }
    cout << "*****************" << endl;

    // release the memory allocated to the heap.
    for(int i = 0; i < rows; ++i){
        delete[] a[i];
    }

    delete a;

    return 0;
}

And the results I obtain are:

*****************
Array indexing method.
*****************
0       1       2
3       4       5
6       7       8
9       10      11
12      13      14
15      16      17
*****************
0       1       2
3       4       5
6       7       8
9       10      11
12      13      14
15      16      17
*****************
Pointer dereferencing method.
*****************
0       1       2
0       3       4
5       0       6
7       8       0
9       10      11
0       12      13
*****************
0       1       2
3       4       5
6       7       8
9       10      11
12      13      14
15      16      17
*****************

And I can see that in the third block of output, the heap allocated array isn't being read properly, but the stack allocated array is.

Thanks again.

Josh M.
  • 55
  • 3
  • 3
    `double q[rows][columns];` VLAs aren't standard c++. – user0042 Sep 13 '17 at 20:46
  • This isn't the problem, but do you really need the extra stuff that `std::endl` does? `'\n'` ends a line. – Pete Becker Sep 13 '17 at 20:51
  • No, I don't, you're right for this example `\n` would have been fine. – Josh M. Sep 13 '17 at 20:55
  • It isn't related to allocating on the heap vs the stack, BTW. You could have allocated a single, contiguous block of memory on "the heap". – juanchopanza Sep 13 '17 at 20:56
  • Yep, just allocating `arr = new double[rows * columns];` would yield better results. And then to access row 3, col 2 you could do: `arr[row * columns + col]` – Carl Sep 13 '17 at 21:22
  • @JoshM. -- [Related information](https://stackoverflow.com/questions/21943621/how-to-create-a-contiguous-2d-array-in-c/21944048#21944048), if you want a contiguous 2D array and keep the `[ ][ ]` syntax. – PaulMcKenzie Sep 13 '17 at 21:59
  • @JoshM. -- Also, note that using the code at the SO link I provided, the `double **` created array shows the same results as the (non-standard) VLA version you posted, [as seen here](https://ideone.com/f7CufD). – PaulMcKenzie Sep 13 '17 at 22:05
  • @JoshM.So the real issue is *how* you're allocating from the heap. You can allocate from the heap, have contiguous data, and still use `double **`. Your attempt allocated arrays in a naive fashion -- one allocation per row. – PaulMcKenzie Sep 13 '17 at 22:22

3 Answers3

11

&q[0][0] gives you a pointer to the first double in a block containing rowsxcolumns doubles. While &a[0][0] gives you a pointer the first double in a block containing columns doubles (you've allocated it using a[0] = new double[columns];, remember?). So accessing it columns*i + j will be out of bounds and will triggers Undefined Behavior.

user7860670
  • 35,849
  • 4
  • 58
  • 84
  • 3
    Ah, so my problem is that I'm assuming that the allocation from the heap was contiguous, but it actually isn't? Presumably each row of columns is quite near the others, because in my output I am getting some of the appropriate values, just not where they should be. – Josh M. Sep 13 '17 at 20:53
  • 2
    @JoshM. Your heap allocations are not contiguous at all. You allocate 1 block containing `rows` count of pointers to double and then `rows` count of blocks containing `columns` count of doubles each. Though it makes sense that they are allocated one after another (with some padding in between) in this particular case since heap is not cluttered yet. But one should not rely on any assumptions of this kind about allocator behavior. – user7860670 Sep 13 '17 at 21:02
  • You might add this to your answer: the "pointer dereferencing method" invokes UB even on the stack-allocated array since it indexes past the bounds of the first subarray. Even though memory is contiguous, this is still illegal. – Todd Fleming Sep 14 '17 at 00:19
3

Arrays are usually a single allocation, on the stack in this case. The array value q[1][0] is being positioned right after q[0][columns-1]. The pointer implementation is allocating a disjoint block of memory for each row from the heap. By referencing beyond the end of the first row, you are in undefined behavior territory.

Khouri Giordano
  • 1,426
  • 15
  • 17
2

The stack based array is fully contiguous memory. Your heap based one is one allocation per row, plus an allocation to hold all those allocations. The memory for each allocation may be scattered throughout your heap in any order, with any amount of space between them. You cannot assume that adding offsets to the base of the first row's allocation is valid if you go past its end, and it's certainly a bug to try to reach a separate allocation by walking far enough off the end of a previous allocation.

If you want to do what you're trying, change your new[] expression to be flattened into a single array:

double * q = new [rows * columns];

Then you can index into it as if it is a 2d array.

Also, in your original post you use the wrong type of delete call on a, which is also an array and needs to be deleted as one (with [] after it.)

Chris Uzdavinis
  • 6,022
  • 9
  • 16