
I'm hesitating over how to organize the memory layout of my 2D data. Basically, what I want is an N*M 2D double array, where N and M are in the thousands (and are derived from user-supplied data).

The way I see it, I have two choices:

double *data = new double[N*M];

or

double **data = new double*[N];
for (size_t i = 0; i < N; ++i)
     data[i] = new double[M];

The first choice is the one I'm leaning toward. The main advantages I see are: shorter new/delete syntax, a contiguous memory layout (which means adjacent memory accesses at runtime if I arrange my accesses correctly), and possibly better performance for vectorized code (auto-vectorized, or using vector libraries such as vDSP or vecLib).

On the other hand, it seems to me that allocating one big contiguous chunk of memory could fail, or take more time, than allocating a bunch of smaller ones. The second method also has the advantage of the shorter access syntax data[i][j] compared to data[i*M+j].

What would be the most common / better way to do this, mainly from a performance standpoint? (Even though the improvements will be small, I'm curious to see which would perform better.)

Olotiar
    Are `N` and `M` constants? – jxh Aug 07 '14 at 01:35
  • 1
    Why not create a class that wraps around a `std::vector`, overloading `operator()(size_t i, size_t j)` for access. – Yuushi Aug 07 '14 at 01:41
  • 1
    Not sure why this is tagged with C, as if an answer with `malloc` and alike would be accepted or anything. – Utkan Gezer Aug 07 '14 at 01:49
  • @jxh : no, N and M are derived from user-supplied data – Olotiar Aug 07 '14 at 07:39
  • @ThoAppelsin : as stated, my concern is merely about the memory layout and performance implications, not about precise syntax. Any solution in any language that deals directly with pointers is fine with me (pseudo-code also). The C language seemed relevant in that case. Retagging... – Olotiar Aug 07 '14 at 07:39

3 Answers


Between the first two choices, for reasonable values of M and N, I would almost certainly go with choice 1. You skip a pointer dereference, and you get nice caching if you access data in the right order.

In terms of your concerns about size, we can do some back-of-the-envelope calculations.

Since M and N are in the thousands, suppose each is 10000 as an upper bound. Then your total memory consumed is

 10000 * 10000 * sizeof(double) = 8 * 10^8 bytes

This is roughly 800 MB which, while large, is quite reasonable given the amount of memory in modern machines.

merlin2011
  • Ok, so your reasoning is the same as mine for preferring the 1st choice. Thank you for your input. – Olotiar Aug 07 '14 at 07:42

If N and M are constants, it is better to just statically declare the memory you need as a two-dimensional array. Or, you could use std::array.

std::array<std::array<double, M>, N> data;

If only M is a constant, you could use a std::vector of std::array instead.

std::vector<std::array<double, M>> data(N);

If M is not constant, you need to perform some dynamic allocation. But std::vector can manage that memory for you, so you can create a simple wrapper around it. The wrapper below returns an intermediate row object so that the second [] operator can compute the offset into the vector.

#include <cstddef>
#include <vector>

template <typename T>
class matrix {
    const size_t N;
    const size_t M;
    std::vector<T> v_;
    struct row {
        matrix &m_;
        const size_t r_;
        row (matrix &m, size_t r) : m_(m), r_(r) {}
        T & operator [] (size_t c) { return m_.v_[r_ * m_.M + c]; }
    };
    struct const_row {
        const matrix &m_;
        const size_t r_;
        const_row (const matrix &m, size_t r) : m_(m), r_(r) {}
        T operator [] (size_t c) const { return m_.v_[r_ * m_.M + c]; }
    };
public:
    matrix (size_t n, size_t m) : N(n), M(m), v_(N*M) {}
    row operator [] (size_t r) { return row(*this, r); }
    const_row operator [] (size_t r) const { return const_row(*this, r); }
};


matrix<double> data(10,20);
data[1][2] = .5;
std::cout << data[1][2] << '\n';

In addressing your particular concern about performance: your rationale for wanting a single block of memory is correct. You should, however, avoid doing new and delete yourself (which is something this wrapper takes care of), and if the data is more naturally interpreted as multi-dimensional, then expressing that in the code will make the code easier to read as well.

Multiple allocations, as shown in your second technique, are inferior because they take more time; their advantage is that they may succeed more often if your system's memory is fragmented (the free memory consists of smaller holes, and there is no single free chunk large enough to satisfy the one big allocation request). But multiple allocations have another downside: extra memory is needed to store the pointers to each row.

My suggestion provides the single-allocation technique without needing to call new and delete explicitly, since the memory is managed by the vector. At the same time, it allows the data to be addressed with the two-dimensional syntax [x][y]. So it provides all the benefits of a single allocation together with all the benefits of the multi-allocation approach, provided you have enough memory to fulfill the allocation request.

jxh
  • +1, although depending on the size of `M` and `N`, trying to use `std::array` (which is stack allocated) could cause problems. – Yuushi Aug 07 '14 at 02:27
  • `std::array` is not always "stack allocated". – aschepler Aug 07 '14 at 02:42
  • @aschepler: Hmmm... interesting. What exactly do you mean by that? – jxh Aug 07 '14 at 02:57
  • M and N are not constant. Edited my post accordingly. You merely propose a wrapper similar to @ThoApplesin but don't discuss my main point, which extend beyond syntax – Olotiar Aug 07 '14 at 07:46
  • @Olotiar: The wrapper is superior because the work of managing the memory is done for you, and mine is different because it is a single allocation like yours, but also provides the `[x][y]` syntax that you desired. – jxh Aug 07 '14 at 14:33
  • @Olotiar: A single allocation is not slower than multiple allocations, but it may "fail" if your system is fragmented in a way that there is no memory slot large enough. – jxh Aug 07 '14 at 14:35
  • @Olotiar: Finally, for performance, a single pointer dereference is faster than two pointer dereferences. And cache performance improves with good spatial locality. If you are jumping around to random places all over your vector, then you will probably have about the same number of cache misses using either allocation method. If you are mostly accessing the data in a sequential manner, then a single allocation will give you much better cache utilization. – jxh Aug 07 '14 at 14:41
  • @jxh "A single allocation is not slower than multiple allocations" : that's interesting, can you develop on that ? I'm mentally stuck with this probably prehistoric "free list" allocation model where it would seem that statistically, it would take more time to find big continuous free spots in memory – Olotiar Aug 07 '14 at 16:28
  • We first assume the system has enough memory to satisfy the single allocation request. Otherwise, the question is moot. The first call for the large allocation and the first call for the small allocation will take roughly the same amount of time, because the bulk of the expense will be system call to ask for additional resources. After that, the large allocation is done. The small allocation has to keep asking. On deallocation, the single deallocation makes one call and is done. The multiple allocation version has to make multiple calls, which are either system calls or free list updates. – jxh Aug 07 '14 at 16:32

Consider using something like the following:

// array of pointers to doubles to point the beginning of rows
double ** data = new double*[N];

// allocate one contiguous block, anchored at the first row, big enough to hold all N * M doubles
data[0] = new double[N * M];

// distribute pointers to individual rows as well
for (size_t i = 1; i < N; i++)
    data[i] = data[0] + i * M;

I'm not sure whether this is a general practice; I just came up with it. Some downsides still apply to this approach, but I think it eliminates most of them: you keep the data[i][j] access syntax, while all the doubles live in a single contiguous block.

Utkan Gezer
  • 1
    Actually interesting technique. Would love to hear about people successfully applying something similar. But don't discuss my main point, which extend beyond syntax – Olotiar Aug 07 '14 at 07:43