
So I know similar questions have been asked before, but I didn't quite find what I was looking for.

I want to have a function, separated into .h and .cpp files, which performs some calculations on a 2D array and modifies the array during the calculation. Currently my code looks something like this:

main.cpp:

int gridsize = 100; // usually in range 50-200
int **grid = ...; // define dynamic array

myFunction(grid, gridsize);

myFunction.h:

void myFunction(int **grid, int gridsize);

myFunction.cpp:

void myFunction(int **grid, int gridsize)
{
   for(int step = 0; step <= MAX_STEPS; step++)
   {
      int i = ...;
      int j = ...;
      grid[i][j] = someCalculation();
   }
}

...and currently this works as I want it to.

Now, the question is: is this the only/best way of doing it? myFunction should be applicable to array sizes that are unknown at compile time, and I am looking for good performance, since myFunction will be called ~10^6 times during execution and each call will perform ~10^5 array accesses, which currently corresponds to runtimes in the range of hours.

I know how I would do it with `std::vector`, but that appears to be roughly 1.5 times slower than my current code. I have also seen some approaches with `std::array` and template functions, where I run into trouble with the array size being unknown at compile time (and with a known array size it, again, appears to take almost twice the time to do the calculations).
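(For comparison, a minimal sketch of the flat `std::vector` variant the comments below suggest, passed by reference so no copy is made; the loop body keeps the question's elisions as placeholders:)

#include <vector>

// Sketch: one flat vector of gridsize * gridsize ints; element (i, j)
// lives at index i * gridsize + j.
void myFunction(std::vector<int>& grid, int gridsize)
{
    for (int step = 0; step <= MAX_STEPS; step++)
    {
        int i = ...; // as in the question
        int j = ...; // as in the question
        grid[i * gridsize + j] = someCalculation();
    }
}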

go_science
  • What makes you think that your current approach is not efficient? All you are passing is a pointer and an integer, which is barely any data at all. (I'd also like to know how your code with `std::vector` ended up being 1.5 times slower; if you pass by reference it should have the same performance.) – UnholySheep Mar 22 '22 at 13:19
  • Arrays from C are passed as a pointer to the first element. They are not copied. – drescherjm Mar 22 '22 at 13:20
  • A `std::vector` / `std::map` / `std::unordered_map` should do just fine for your needs. Just pass them by reference using `&` (`type& name`), as @UnholySheep said. – The Coding Fox Mar 22 '22 at 13:23
  • When you used `std::vector`, did you reserve the size of the array before pushing data into the vector? If not, the repeated reallocation as the vector grows might explain some of the time difference. In general there is no reason that `std::vector` should be slower. – ttemple Mar 22 '22 at 13:24
  • How are you creating this 2D array from an `int**`? An `int**` conveys no information at all. If you built your array data in contiguous memory, that makes a *huge* difference, but we don't know until you show us. If you did the naive double `for` loop (allocating rows, then allocating each row's columns in the inner loop), then, as I said, that's naive. If the data is contiguous, a `std::vector` and an `int**` should show similar timings. – PaulMcKenzie Mar 22 '22 at 13:24
  • Please do not try to outsmart the compiler when writing C++ code by using pointers and raw arrays where `std::vector` clearly should be used. I also question the performance issue, as @UnholySheep has mentioned. – Croolman Mar 22 '22 at 13:24
  • When you are testing performance, are you running the release build? Back in the early 2000s we had a lung segmentation algorithm that took a few seconds in Release mode and over a day in Debug mode of Visual Studio. The main reason for the dramatic difference appeared to be the memory-allocation and heap-corruption checks and memory filling that Debug mode performs. – drescherjm Mar 22 '22 at 13:24
  • Yes, if you try to be tricky with pointers attempting to beat the compiler at the optimization game, you will invariably make the code *slower*, not faster. The reason being that the compiler cannot engage the optimizer with code that has all sorts of pointer aliasing and "tricks" done. – PaulMcKenzie Mar 22 '22 at 13:27
  • @UnholySheep I have only tested the performance of element access. Say we have a `std::vector` and an `int*`, both initialized with the same number of elements; sequentially accessing N random elements was slower for the `std::vector` (I didn't measure the initialization). The same holds for a 2D vector/array. – go_science Mar 22 '22 at 15:20
  • @PaulMcKenzie In fact, I did it the naive way. What other way is there to initialize an `int**` with N elements per dimension "contiguously", instead of using an `int*` of size NxN? – go_science Mar 22 '22 at 15:27
  • @go_science [See this](https://stackoverflow.com/questions/21943621/how-to-create-a-contiguous-2d-array-in-c/21944048#21944048) (a sketch of that technique follows these comments). – PaulMcKenzie Mar 22 '22 at 15:48
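A minimal sketch of that contiguous-allocation technique, assuming a square grid as in the question (`makeGrid` and `freeGrid` are hypothetical names): all elements live in one block, and the `int**` row pointers index into it, so the existing `grid[i][j]` code keeps working.

int** makeGrid(int n)
{
    // One contiguous block for all n * n elements, zero-initialized:
    int** rows = new int*[n];
    rows[0] = new int[n * n]();
    // Each row pointer points into that same block:
    for (int i = 1; i < n; ++i)
        rows[i] = rows[0] + i * n;
    return rows;
}

void freeGrid(int** rows)
{
    delete[] rows[0]; // the data block
    delete[] rows;    // the row pointers
}

With this layout the `int**` interface is unchanged, but a full traversal touches memory sequentially, which is also what makes a flat `std::vector` fast.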

1 Answer


As others have commented, you're only passing a pointer and a size. Is the array square? That is, are the rows and columns the same length?

What might be faster is to make the array contiguous:

int* array = new int[width * height];

// If i ranges over the rows (height) and j over the columns (width):
int someValue = array[i * width + j];

You can then make the traversal faster by hoisting the multiplication out of the inner loop:

for (int row = 0; row < height; ++row) {
    int idx = row * width;            // index of the first element in this row
    for (int col = 0; col < width; ++col, ++idx) {
        int value = array[idx];       // sequential access through the row
    }
}

I don't know if that's faster, but it might beat dereferencing two levels of pointers. But remember that branch prediction and cache locality can make your code a LOT faster, so be careful how you optimize.
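If the question's per-step updates can be restructured into a full sweep of the grid, the same sequential pattern applies directly. A sketch, where `someCalculation()` is the question's placeholder and the full-sweep restructuring is an assumption:

// Hypothetical sketch: visit every element once, in memory order.
void sweepGrid(int* grid, int width, int height)
{
    int idx = 0;
    for (int row = 0; row < height; ++row)
        for (int col = 0; col < width; ++col, ++idx)
            grid[idx] = someCalculation(); // sequential writes, cache-friendly
}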

Joseph Larson