
I was poking around with multidimensional arrays today, and I came across a blog post which distinguishes between rectangular arrays and jagged arrays. Usually I would do this for both jagged and rectangular:

Object** obj = new Object*[5];
for (int i = 0; i < 5; ++i)
{
   obj[i] = new Object[10];
}

But the blog said that if I know the 2D array is rectangular, then I'm better off allocating the entire thing as a 1D array and using an improvised way of accessing the elements, something like this:

Object* obj = new Object[rows * cols];
obj[x * cols + y]; 
// which would have been obj[x][y] in the previous implementation

I have a hunch that allocating one contiguous memory chunk would be good, but I don't really understand how big a difference it would make. Can somebody explain?

Epsilon_
  • You can have locality issues with separate allocations for each row (i.e. the data is not stored "close to each other" so that iterating the elements needs more loads from memory than with a one-chunk solution). In practice that shouldn't matter that much if you allocate early in the program (so that the blocks are allocated in successive chunks and not scattered around in previously freed places). – Peter - Reinstate Monica Apr 18 '16 at 08:29
  • This question may be closed because it might not be a good fit for Stackoverflow. That being said, it probably does not matter what you use at this point. If you are making a performance heavy app (like a commercial game, or some scientific algorithm that takes days to finish) then a performance question like this is relevant. Actually, you should see the performance difference for yourself. Make two test cases that are relevant to you, and compare their results. – Dialecticus Apr 18 '16 at 08:31
  • Take a look at http://stackoverflow.com/q/17259877/3235496 – manlio Apr 18 '16 at 08:32
  • Yeah, you're probably right; I've read something about "memory holes" back then. But I'm a little worried about its effects in the long run. This isn't just about the performance part, I was just checking for overlooked implications. – Epsilon_ Apr 18 '16 at 08:35

4 Answers


Allocating a 2D array as one big chunk permits the compiler to generate more efficient code than allocating it in multiple chunks: at the very least, the one-chunk approach saves one pointer dereference per element access. BTW, declaring the 2D array like this:

Object obj[rows][cols];
obj[x][y]; 

is equivalent to:

Object* obj = new Object[rows * cols];
obj[x * cols + y]; 

in terms of speed. But the first one is not dynamic (you need to specify the values of "rows" and "cols" at compile time).
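
If you need both dimensions to be run-time values and still want a single contiguous chunk, one option is to hide the index arithmetic behind a small wrapper. The following is only a sketch of that idea; the `Grid` class name and its `at()` accessor are illustrative, not something from the answer, and `std::vector` is used here just to get the single allocation:

#include <cstddef>
#include <vector>

// Sketch: a dynamically sized "2D" array stored in one contiguous chunk,
// row-major. Grid and at() are illustrative names, not a standard API.
template <typename T>
class Grid {
public:
    Grid(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols) {}

    T&       at(std::size_t r, std::size_t c)       { return data_[r * cols_ + c]; }
    const T& at(std::size_t r, std::size_t c) const { return data_[r * cols_ + c]; }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_, cols_;
    std::vector<T> data_;   // one allocation for all rows * cols elements
};

// Usage: Grid<int> g(rows, cols); g.at(x, y) = 42;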

AhmadWabbi
  • I see, but I'm a little worried about the element access part `obj[x * cols + y]`; I mean, it's going to get used in a loop and there's a multiplication in there. – Epsilon_ Apr 18 '16 at 08:31
  • You can always precompute your row pointer (see the sketch after these comments). Or you can maintain an array of row pointers. – paddy Apr 18 '16 at 08:35
  • That is very little calculation to perform when you simulate a 2D array using a 1D array. See [link](http://stackoverflow.com/questions/2151084/map-a-2d-array-onto-a-1d-array-c) for example. – AhmadWabbi Apr 18 '16 at 08:35
  • I think that the compiler secretly creates the same multiplication if you use a true 2-dimensional array :-). Integer operations are very fast compared to memory access. Locality will be more important than an additional multiplication. – Peter - Reinstate Monica Apr 18 '16 at 09:33
  • Oh, and also important is that `Object obj[rows][cols];` is a local or global variable and takes virtually zero time to allocate. A dynamic allocation like `Object* obj = new Obj[rows * cols];`, by contrast, takes significant time. That tends to be important in a tight loop, and for small arrays: for large arrays the run-time behavior of your program will probably be dominated by the many operations you perform on the many array elements. – Peter - Reinstate Monica Apr 18 '16 at 10:01
  • IMHO, locality is not an issue here. It is true that a one-chunk 2D array has better spatial locality, but there is also temporal locality, and caches take advantage of that too. It is a matter of one (or a few) cache misses. In my graduation project, I had a relatively big 2D array that had to be initialized and traversed many times (for each character in an OCR program). I can tell you that this additional multiplication operation counted a lot! – AhmadWabbi Apr 18 '16 at 11:16
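
To illustrate paddy's comment about precomputing the row pointer, here is a minimal sketch (the `fill` function and the use of `int` are my own assumptions, not code from the thread):

#include <cstddef>

// Sketch of "precompute your row pointer": the base address of the current
// row is computed once per row, so the inner loop does plain indexing only.
void fill(int* data, std::size_t rows, std::size_t cols)
{
    for (std::size_t x = 0; x < rows; ++x)
    {
        int* row = data + x * cols;                   // one multiplication per row
        for (std::size_t y = 0; y < cols; ++y)
        {
            row[y] = static_cast<int>(x * 1000 + y);  // no x * cols + y in here
        }
    }
}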

First, and less important: when you allocate and free your array, you only need to do a single allocation/deallocation instead of one per row.

More important: when you use the array you basically get to trade a multiplication against a memory access. On modern computers, memory access is much much much slower than arithmetic.

That's a bit of a lie, because much of the slowness of memory accesses gets hidden by caches -- regions of memory that are being accessed frequently get stored in fast memory inside, or very near to, the CPU and can be accessed faster. But these caches are of limited size, so (1) if your array isn't being used all the time then the row pointers may not be in the cache and (2) if it is being used all the time then they may be taking up space that could otherwise be used by something else.

Exactly how it works out will depend on the details of your code, though. In many cases it will make no discernible difference one way or the other to the speed of your program. You could try it both ways and benchmark.

[EDITED to add, after being reminded of it by Peter Schneider's comment:] Also, if you allocate each row separately they may end up all being in different parts of memory, which may make your caches a bit less effective -- data gets pulled into cache in chunks, and if you often go from the end of one row to the start of the next then you'll benefit from that. But this is a subtle one; in some cases having your rows equally spaced in memory may actually make the cache perform worse, and if you allocate several rows in succession they may well end up (almost) next to one another in memory anyway, and in any case it probably doesn't matter much unless your rows are quite short.
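
If you do want to try it both ways, a rough benchmark sketch might look like the following. The array sizes, the summing workload, and the `std::chrono` timing are arbitrary choices for illustration; compile with optimizations and treat the numbers as indicative only:

#include <chrono>
#include <cstdio>

int main()
{
    const int rows = 2000, cols = 2000;

    // Version 1: one contiguous chunk (value-initialized to zero).
    int* flat = new int[rows * cols]();

    // Version 2: one allocation per row ("jagged" layout).
    int** jag = new int*[rows];
    for (int r = 0; r < rows; ++r) jag[r] = new int[cols]();

    auto t0 = std::chrono::steady_clock::now();
    long long sum1 = 0;
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            sum1 += flat[r * cols + c];
    auto t1 = std::chrono::steady_clock::now();

    long long sum2 = 0;
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            sum2 += jag[r][c];
    auto t2 = std::chrono::steady_clock::now();

    // Print the sums so the compiler cannot optimize the loops away.
    std::printf("flat:   sum=%lld, %lld us\n", sum1, (long long)
        std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count());
    std::printf("jagged: sum=%lld, %lld us\n", sum2, (long long)
        std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1).count());

    for (int r = 0; r < rows; ++r) delete[] jag[r];
    delete[] jag;
    delete[] flat;
    return 0;
}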

Gareth McCaughan

By having one large contiguous chunk of memory, you may get improved performance because there is more chance that memory accesses are already in the cache. This idea is called cache locality. We say the large array has better cache locality. Modern processors have several levels of cache. The lowest level is generally the smallest and the fastest.

It still pays to access the array in meaningful ways. For example, if data is stored in row-major order and you access it in column-major order, you are scattering your memory accesses. At certain sizes, this access pattern will negate the advantages of caching.
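
As a sketch of that difference (the function names and the use of `double` here are illustrative assumptions), both loops below visit the same elements of a row-major array, but only the first one walks memory contiguously:

#include <cstddef>

// 'data' is assumed to be a row-major array of rows * cols elements,
// i.e. element (r, c) lives at data[r * cols + c].
double sum_row_major(const double* data, std::size_t rows, std::size_t cols)
{
    double s = 0.0;
    for (std::size_t r = 0; r < rows; ++r)        // outer loop over rows...
        for (std::size_t c = 0; c < cols; ++c)    // ...inner loop touches adjacent elements
            s += data[r * cols + c];
    return s;
}

double sum_column_order(const double* data, std::size_t rows, std::size_t cols)
{
    double s = 0.0;
    for (std::size_t c = 0; c < cols; ++c)        // swapping the loops makes each access
        for (std::size_t r = 0; r < rows; ++r)    // jump 'cols' elements ahead: poor locality
            s += data[r * cols + c];
    return s;
}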

Having good cache performance is far preferable to any concerns you may have about multiplying values for indexing.

paddy

If one of the dimensions of your array is a compile-time constant you can allocate a "truly 2-dimensional array" in one chunk dynamically as well and then index it the usual way. Like all dynamic allocations of arrays, `new` returns a pointer to the element type. In this case of a 2-dimensional array the elements are in turn arrays -- 1-dimensional arrays. The syntax of the resulting element pointer is a bit cumbersome, mostly because the dereferencing `operator*()` has a lower precedence than the indexing `operator[]()`. One possible allocation statement could be `int (*arr7x11)[11] = new int[7][11];`.

Below is a complete example. As you see, the leftmost (outermost) index in the allocation can be a run-time value; it determines the number of elements in the allocated array. The other indices determine the element type (and hence the element size as well as the overall size) of the dynamically allocated array, which of course must be known in order to perform the allocation. As discussed above, the elements are themselves arrays, here 1-dimensional arrays of 11 ints.

#include<cstdio>
using namespace std;

int main(int argc, char **argv)
{
    constexpr int cols = 11;
    int rows = 7;

    // overwrite with cmd line arg if present.
    // if scanf fails, default is retained.
    if(argc >= 2) { sscanf(argv[1], "%d", &rows); }

    // The actual allocation of "rows" elements of 
    // type "array of 'cols' ints". Note the brackets
    // around *arr7x11 in order to force operator
    // evaluation order. arr7x11 is a pointer to array,
    // not an array of pointers.
    int (*arr7x11)[cols] = new int[rows][cols];

    for(int row = 0; row<rows; row++)
    {
        for(int col = 0; col<cols; col++)
        {
            arr7x11[row][col] = (row+1)*1000 + col+1;
        }
    }

    for(int row = 0; row<rows; row++)
    {
        for(int col = 0; col<cols; col++)
        {
            printf("%6d", arr7x11[row][col]);
        }
        putchar('\n');
    }

    return 0;
}       

A sample session:

 g++ -std=c++14 -Wall -o 2darrdecl 2darrdecl.cpp && ./2darrdecl 3
  1001  1002  1003  1004  1005  1006  1007  1008  1009  1010  1011
  2001  2002  2003  2004  2005  2006  2007  2008  2009  2010  2011
  3001  3002  3003  3004  3005  3006  3007  3008  3009  3010  3011
Peter - Reinstate Monica