Performance difference between two ways of mapping indices into a 1-dimensional array

Question

I'm now trying to follow the question below:

Performance of 2-dimensional array vs 1-dimensional array

especially when assigning to array elements in my code.cpp.

Actually, method 1 below, which computes the index through a function, is much slower

1

int getIndex(int row, int col) return row*NCOLS+col;

#define NROWS 10
#define NCOLS 20
This:

int main(int argc, char *argv[]) {
    int myArr[NROWS*NCOLS];
    for (int i=0; i<NROWS; ++i) {
       for (int j=0; j<NCOLS; ++j) {
          myArr[getIndex(i,j)] = i+j;
       }
    }
    return 0;
}

than method 2, which computes the index directly:

2

#define NROWS 10
#define NCOLS 20
This:

int main(int argc, char *argv[]) {
    int myArr[NROWS*NCOLS];
    for (int i=0; i<NROWS; ++i) {
       for (int j=0; j<NCOLS; ++j) {
          myArr[row*NCOLS+col] = i+j;
       }
    }
    return 0;
}

But I can't understand why '1' is slower than '2'.

In my experiments, '1' is almost twice as slow as '2'. I don't think this makes sense.

Because of unnecessary function calls? Also, both arrays have only one dimension. Even more, there are no N-D arrays in C, only 1-D treated in various ways. — ForceBru, May 18 '17 at 08:21
`myArr[getIndex(i,j)] = i+j;` this is inside two loops, meaning that `getIndex` is called a lot of times. — Badda, May 18 '17 at 08:23
@RichardHodges no, I didn't compile with optimization enabled. — amel, May 18 '17 at 08:23
What optimisation level? And the array size is tiny, how did you test the performance? — Richard Critten, May 18 '17 at 08:23
Profiling without enabling optimisations is a waste of time (in more ways than one...). — Oliver Charlesworth, May 18 '17 at 08:25
I'm not able to reproduce the issue you claim to have, as neither of these pieces of code *compile* let alone *run*... — autistic, May 18 '17 at 08:26
How did you measure the execution time? Can you provide a [mcve]? — Jabberwocky, May 18 '17 at 08:29
What are you trying to accomplish? IOW, what's the application for which you are comparing the speed of these two implementations? — meaning-matters, May 18 '17 at 08:36
@amel if you didn't compile with optimisation enabled, look at the generated assembly code. — Jabberwocky, May 18 '17 at 08:46
Interestingly, in my experiment, one version is redirected to another version with a `jmp` instruction. https://godbolt.org/g/Z8HUae — , May 18 '17 at 08:52
Please edit your post with: 1) information about which compiler that is used and 2) the compiler and optimization options you have enabled. — Lundin, May 18 '17 at 09:17
@NickyC that's because of the -Os option which minimizes the size of the generated code. With no optimisation (-O0 vs -O4) a pretty big `getsize` function is actually created and called, which may explain the difference in terms of execution time. — Jabberwocky, May 18 '17 at 09:23

CookedCthulhu · Answer 1 · 2017-05-18T12:30:55.323

Because you didn't enable optimizations. `getIndex()` is small enough to be (almost certainly) inlined. Just enabling Release mode in Visual Studio made the "slow" version so fast that I wasn't able to make the array big enough to measure the time without running into a stack overflow. ~~Accessing an array on the heap would distort the test results, so that's not an option.~~ Apart from that, you didn't use a 2D array in your code; one would look like this: `int myArr[NROWS][NCOLS]`.
Simple math (like i + j) will most likely not be a bottleneck in your code either. If it becomes one, you should look for new algorithms first. For example: do you really need to iterate through the entire array or would other data types, which don't access the array by index, be more fitting? There are very few cases where micro-optimizations like this are really necessary. Probably never, if your array has a size of 10*20 elements.

Go for readability, finish your program, profile it, then decide if that loop really needs optimization.

Why would a heap array change the results? Just allocate it before you start the stopwatch :) — Quentin, May 18 '17 at 10:31
You're right, only (de-)allocation should make a difference. Changed it. — CookedCthulhu, May 18 '17 at 12:31

Jaggesher Mondal · Answer 2 · 2017-05-18T08:34:47.930

Because in the first example you call a function, while in the second example you write the computation inline. As you may know, when a program calls a function, it saves its current state so that it knows where to return afterwards. This work costs a few clock cycles.

So, according to your code, the first example spends a few more clock cycles on the function call than the second example does. For this reason the second one might be faster than the first.

Here you find similar logic: http://www.cplusplus.com/forum/articles/20600/

edited May 18 '17 at 08:34

answered May 18 '17 at 08:28


You are equating a high-level concept (a function call) with low-level speculation (whether the function's code will be inlined or not). Profiling non-optimized builds is pointless -- and the article you linked is chock full of mistakes. – Quentin May 18 '17 at 10:35
