1

Via stackoverflow threads like this one, I discovered that you could use an array of pointers to manage a 2D array. In the past I used to use pointer to pointer to store 2D arrays, but now I have a requirement to store my data in contiguous memory, so pointer to pointer format doesn't work anymore.

A raw 2D array is an alternative to array of pointers, but raw 2D array doesn't work for me because I want to be able to allocate storage in the heap and not on the stack because my container can be very large and I might encounter stack overflow if I use vanilla 2D array.

My question is about how memory is allocated for array of pointers. I use new operator like below (see my constructor) to allocate my array of pointers

#include <iostream>
using namespace std;

template<typename DataType, unsigned numRows, unsigned numCols>
class Container2D {
    public:
        Container2D() {
            m_data = new DataType[numRows][numCols];
        }
        
        ~Container2D() {
            delete [] m_data;
        }
        
        DataType* getData() { return &m_data[0][0]; }
    
    private:
        DataType (*m_data)[numCols];
    
};


int main() {
    
    Container2D<int, 3, 3> container;
    
    return 0;
}

Does new DataType[numRows][numCols] allocate the entire 2D array on the heap or does it allocate numRows pointers on the heap while allocating numCols objects of type DataType on the stack?

In a pointer to pointer scenario (where I'd define my storage as DataType** m_data), I know for a fact that both dimensions of my array are allocated on the heap and I would call delete m_data[i] for each column and then call delete[] m_data to free my row data. In the array of pointers scenario, I'm not sure if my destructor above is freeing up data correctly.

DigitalEye
  • 1,456
  • 3
  • 17
  • 26
  • 1
    The standard doesn't require implementations to have a stack or heap - but, yes, the entire 2D will (on systems using a heap) be allocated on the heap. – Ted Lyngmo May 28 '22 at 06:47
  • 2
    One way to free you self from these worries is to use a 1D contiguous array and implement the 2D index to 1D index mapping instead. – ihdv May 28 '22 at 06:53
  • 1
    No such thing as "heap" memory in standard C++ but `new`/`delete` do use heap on systems that have a heap. Instead of using a pointer and `new`/`delete`, try using a suitable standard container. `std::vector >` can be used to represent a 2D array if contiguity of rows/columns doesn't matter, and `std::vector` (with suitable mapping of indices) can be used if contiguity matters. The default allocator for `std::vector` uses `new`/`delete` (so, if you follow basic guidelines to use `std::vector` correctly) will correctly manage/release memory (e.g. avoid leaks) – Peter May 28 '22 at 07:08
  • Or use a `std::unique_ptr, numRows>>` - [example](https://godbolt.org/z/rKdPo1KEd) – Ted Lyngmo May 28 '22 at 07:16
  • Is there any reason why you don't want to use a `std::vector` to manage your memory? – Costantino Grana May 28 '22 at 07:25
  • @Peter: I wasn't aware that the standard doesn't mandate usage of heap for dynamic memory allocation. I had assumed using the new operator meant that heap was used for allocating the memory requested. On systems where heap is not used does the standard guarantee that there won't be a stack overflow? – DigitalEye May 29 '22 at 00:07
  • @TedLyngmo: array of array doesn't work because I need my data to be stored in a contiguous block of memory. My code interfaces with a library that expects a pointer to be passed to this data and will access the data by incrementing the pointer. – DigitalEye May 29 '22 at 00:09
  • @Di9gitalEye The standard doesn't use the term "heap" or "stack" in regard to memory at all. The standard is hardware neutral and the origins of heap and stack (in regard to memory, at least) are related to particular hardware architectures (heap and stack were labels applied to memory implemented using distinct electronic circuits). Not all hardware architectures have memory areas of heap or stack at all. – Peter May 29 '22 at 00:35
  • @DigitalEye I see. The data _is_ contiguous with an array of array, but it won't have the right type for what you want. – Ted Lyngmo May 29 '22 at 05:09

2 Answers2

1

One way is to use a 1D contiguous array and implement the 2D index to 1D index mapping instead:

#include <iostream>
#include <cassert>
using namespace std;

template<typename DataType, unsigned numRows, unsigned numCols>
class Container2D {
    public:
        Container2D() { m_data = new DataType[numRows * numCols]; }
        ~Container2D() { delete [] m_data; }
        inline DataType  operator()(unsigned i, unsigned j) const {
            assert( 0 <= i && i < numRows);
            assert( 0 <= j && j < numCols);
            return m_data[i*numCols+j];
        }
        inline DataType& operator()(unsigned i, unsigned j) {
            // same as above, but this allows inplace modifications
            assert( 0 <= i && i < numRows);
            assert( 0 <= j && j < numCols);
            return m_data[i*numCols+j];
        }
    private:
        DataType* m_data;
};


int main() {
    
    Container2D<int, 3, 3> container;
    int x = container(0,0);  // retrieve the element at (0,0);
    container(1,2) = 9;      // modify the element at (1,2);
    // int y = container(3,0);  // triggers assertion errors for out-of-bound indexing
    
    return 0;
}

Notes:

  • If numRows and numCols do not change for a specific class instance, new and delete are not necessary in this case. If they do dynamically change, it is better to store them as member variables instead of template parameters. If numRows and numCols are too large, one can dynamically allocate Container2D objects as a whole.
  • As @GoswinvonBrederlow commented, there is no difference between this and new m_data[numRows][numCols] in terms of memory layout, but this convention makes it easier to extend to higher dimensions.
ihdv
  • 1,927
  • 2
  • 13
  • 29
  • 1
    I'd suggest to switch from `DataType* m_data;` to `std::unique_ptr m_data;`, and/or to provide/delete copy and move constructors and assignment. – Costantino Grana May 28 '22 at 07:57
  • How is `new DataType[numRows * numCols];` better than `new DataType[numRows][numCols];`? – Goswin von Brederlow May 28 '22 at 15:44
  • @GoswinvonBrederlow: I don't think one is better than the other. The former format allocates a single memory unit of size numRows*numCols and returns a pointer to it. The latter allocates an array of numRows pointers each of which can point to block of numCols items of DataType type. – DigitalEye May 29 '22 at 00:12
  • @ihdv: "if numRows and numCols do not change for a specific class instance, new and delete are not necessary in this case." I assume you're recommending declaring m_data as ```DataType m_data[numRows * numCols] ``` if numRows and numCols are compile-time constants. However, for very large values of numRows and numCols isn't there a risk that declaring a static array might overflow the stack? – DigitalEye May 29 '22 at 00:13
  • @DigitalEye No, both allocate a single block of memory. The only difference is that in one case the compiler multiplies row * numCols for you and the other you have to do that yourself. Well, and the bound checking warnings the compiler can show for the later case. – Goswin von Brederlow May 29 '22 at 00:51
  • @ihdv: ```int m[10000][10000]``` causes a segfault on my platform where as ```Container2D``` works fine. Clearly, I risk overflowing the stack if give up new. Thoughts? – DigitalEye May 29 '22 at 03:04
  • @GoswinvonBrederlow I don't mean one is better in this specific example. But this makes it easier to extend to higher dimensions. – ihdv May 29 '22 at 04:43
  • @DigitalEye You can `new` the `Container2D` object as a whole. – ihdv May 29 '22 at 04:52
  • 1
    @DigitalEye But then you have to `delete` it later and take care of copying and moving it. Here is an idea, lets put it in a wrapper that manages that for you. – Goswin von Brederlow May 29 '22 at 11:45
1

Does new DataType[numRows][numCols] allocate the entire 2D array on the heap or does it allocate numRows pointers on the heap while allocating numCols objects of type DataType on the stack?

When you write

DataType arr[numRows][numCols];

the memory will be in one contiguos block as you mentioned. Nothing changes there when you use new. It will allocate one contiguos block of memory of the specified type. There is no hidden array of pointers into the real data.

Goswin von Brederlow
  • 11,875
  • 2
  • 24
  • 42