12

Possible Duplicate:
Why is my program slow when looping over exactly 8192 elements?

I have been tinkering around with a program that I'm using to simply sum the elements of a 2d array. A typo led to what seem to me at least, some very strange results.

When dealing with array, matrix[SIZE][SIZE]:

for(int row = 0; row < SIZE; ++row)
    for(int col = 0; col < SIZE; ++col)
        sum1 += matrix[row][col];

Runs very quickly, however is the above line sum1... is modified:

sum2 += matrix[col][row]

As I did once on accident without realizing it, I notice that my runtime increases SIGNIFICANTLY. Why is this?

Community
  • 1
  • 1
Flexo1515
  • 1,007
  • 1
  • 10
  • 27

5 Answers5

16

This is due to caching behaviour of your program.

Arrays are just consecutive blocks of memory, so when you access [row][column] you are accessing the memory sequentially. This means the data page you are accessing is on the same page, so the access is much faster.

When you do [column][row], you aren't accessing that memory sequentially anymore, so you will end up with more cache misses, so your program runs much slower.

Oleksi
  • 12,947
  • 4
  • 56
  • 80
7

The memory locations of matrix[row][col] and matrix[row][col + 1] are adjacent.

The memory locations of matrix[row][col] and matrix[row + 1][col] are separated by SIZE amount of items.

Computers like accessing memory SEQUENTIALLY not RANDOMLY, thus the adjacent access is faster. For an analogy think hard drive performance, sequential read/write is always better than random read/write. This has to do with how your CPU caches memory and tries to predict what you'll need next.

René Höhle
  • 26,716
  • 22
  • 73
  • 82
Louis Ricci
  • 20,804
  • 5
  • 48
  • 62
  • The suggested analogy is a great deal confusing since HDD is not RAM and OP is! Caching is obviously the answer. – Mike Siomkin Dec 10 '21 at 11:43
  • Mechanical HDD sequential read is FAST because the physical read head has to move less; RAM sequential read is FAST because there are less CPU cache misses. Mechanical HDD random read is SLOW because the physical read head has to move around more; RAM random read is SLOW because there are more CPU cache misses. In this analogy, SLOW and FAST things, that are not exactly/literally the same, are compared. OP might not know how CPU cache works, so comparing to HDD Sequential == FAST & Random == SLOW might be helpful in understanding the concept. Hope this explains it for you @MikeSiomkin – Louis Ricci Dec 10 '21 at 19:26
  • I don't agree. Generating RANDOM uuids for a primary key in DBMS is usually SLOWER than using a SEQUENTIAL autoincrement integer counter. Should we also add this example here? These cases has nothing to do with clarifying WHY changing the indices ordering changes the execution time. They only state that other random things may also be slower than their sequential counterparts. What's worse they are trying to explain the effect in their own (different) way. It's confusing! – Mike Siomkin Dec 11 '21 at 17:13
3

It's because in the quicker case the CPU's memory prefetching is actually useful as you're iterating in a linear fashion. In the slow case you're jumping around the memory and so prefetching has little effect as the data is unlikely to be in the cache.

James
  • 9,064
  • 3
  • 31
  • 49
3

It depends on how the matrix is ordered. You are accessing the array either in row-major or column-major. Depending on how it is stored in memory, the speed will be different between the two

dchhetri
  • 6,926
  • 4
  • 43
  • 56
-5

2d array is just pointer to pointer. So it's look like

[*p][*p][*p]
  |   |   |
  v   v   v
 [d] [d] [d]
 |a| |a| |a|
 |t| |t| |t|
 [a] [a] [a]

So when you call data on non-main-array (what this pointers indicate on) your OS put it to CPU cache.

Mateusz
  • 690
  • 1
  • 6
  • 21
  • A 2D array is not a pointer to a pointer. An array is not a pointer, it is an array. A 2D array is an array of arrays, and if you try to pass it into a function taking a `Type **`, it will fail because it decays into a pointer to an array, not a pointer to a pointer. – chris Oct 26 '12 at 20:19
  • @chris: Ok, can you tell me why you can call `a[5]` and `a+5` or `5+a` or `5[a]` and it's the same? Or when you dynamic 2d array you type `int** ary = new int*[size];` and in loop `ary[i] = new int[size];`? Array is a block of memory and array var is pointer to firs element, so why I can't tell that array is a pointer? – Mateusz Oct 27 '12 at 05:54
  • Your first set of examples works because it **decays** into a pointer. `new[]` returns a pointer, so there's no real conflict of types there. You can prove that an array is not a pointer with a simple example: `int array[100]; int *pointer = new int[100]; std::cout << sizeof array << ' ' << sizeof pointer;` You'll notice a large difference between the two outputs, despite them both having 100 elements. – chris Oct 27 '12 at 06:03
  • @chris: And what happens when you send this array to function? It will works? Why not? And your argument is invalid because there are different implementation of sizeof. BTW why this method works only on static array? Because it's calculate during compilation. – Mateusz Oct 27 '12 at 09:40
  • If you have a `int array[4][5]` and you send it into a function, it decays into a `int (*)[5]`, which can't be converted to a `int **`. It's a pointer, used like an array, to an array of 5 ints. If you were to call the parameter `arg`, `++arg` would move it forward by one full array of 5 ints. Anyway, no matter what `sizeof` is like, it would be consistent: http://liveworkspace.org/code/4604791a54d28b8a2f3e81a870d41c75. You might find this useful: http://stackoverflow.com/questions/4810664/how-do-i-use-arrays-in-c – chris Oct 27 '12 at 16:48