Does the order of nested loops matter for speed

Question

Suppose that I have two arrays arrayOne and arrayTwo where arrayOne.length != arrayTwo.length (suppose the similar case of two List that have different size()). Does any of the following offer a speed advantage over the other?

Option 1

for(int i = 0 ; i < arrayOne.length ; ++i) {
   for(int j = 0 ; j < arrayTwo.length ; ++j) {
    //Do something
   }
}

Option 2

for(int i = 0 ; i < arrayTwo.length ; ++i) {
   for(int j = 0 ; j < arrayOne.length ; ++j) {
    //Do something
   }
}

What are you doing within the loop? And which array is bigger? Processor caches can have a large impact, and ordering of loops could significantly change the cache behaviour. — Jon Skeet, Oct 21 '13 at 06:15

phuclv · Accepted Answer · 2020-04-13T16:52:17.707

7

There's no general result. It depends on your system and the various situation during runtime. Benchmark and get a result for your own. But the one that utilizes more cache locality is often faster

In case of a 2D array then iterating by line then by column is usually faster in languages that are row-major and vice versa. But that's only if you loop through every or the majority of the array. Java is neither row nor column-major so there's no method that's consistently faster for every case

In case of iterating through 2 different arrays like this then no one really knows how it behaves

See also Why does cache locality matter for array performance?

edited Apr 13 '20 at 16:52

answered Oct 21 '13 at 06:15

phuclv

37,963
15
156
475

What's 'more locality'? – user2763361 Oct 21 '13 at 06:22
4

@user2763361: Locality of access to memory - basically, if you hit the same area of memory a lot, that will be more efficient than dotting around the place. – Jon Skeet Oct 21 '13 at 06:23
4

Wow Jon Skeet reply to my message!!!!! – user2763361 Oct 21 '13 at 06:32

score 0 · Answer 2 · answered Oct 21 '13 at 06:16

0

Assuming

arrayOne.length = m
arrayTwo.length = n

Computing time complexity

for(int i = 0 ; i < arrayOne.length ; ++i) {         // O(m)
  for(int j = 0 ; j < arrayTwo.length ; ++j) {  // O(n)
     //Do something
  }
}

Time Complexity - O(m)*O(n) = O(mn)

for(int i = 0 ; i < arrayTwo.length ; ++i) {         // O(n)
  for(int j = 0 ; j < arrayOne.length ; ++j) {  // O(m)
     //Do something
  }
}

Time Complexity - O(n)*O(m) = O(nm) = O(mn)

Thus both the alternatives should take the same time assuming your "do something" takes similar amount of times in both the cases.

answered Oct 21 '13 at 06:16

Ankit Rustagi

5,539
12
39
70

4

Theoretical time complexity isn't the same as actual time taken - because you're assuming that "do something" will take the same amount of time in both cases. – Jon Skeet Oct 21 '13 at 06:17
2

This is a very naive analysis. The performance of an inner loop is often subject to things like data locality. Switching the loops of, for example, a matrix multiplication can result in a significant difference in performance. – sfstewman Oct 21 '13 at 06:20
1

if "do something" takes O(1) then this would hold. I have mentioned the assumption that i have made. But since OP does not say anything about "do something" i cant add more. – Ankit Rustagi Oct 21 '13 at 06:22
@user2743361 true, deleted my comment. – Tafari Oct 21 '13 at 06:26
2

@AnkitRustagi You're entirely ignoring prefactors. Even if "do something" is O(1), there's no guarantee that the prefactor would be the same for different orders of the loops. In fact, there are plenty of examples where it is not. Modern processors do best when you access nearby memory, because it is already in cache. Switching the loops in, say, matrix multiplication, can cause the processor to invalidate cache lines at each iteration of the inner loop. So, while your asymptotic analysis is technically correct, it ignores the reality that the constant prefactors actually matter. – sfstewman Oct 22 '13 at 02:22
The assumptions above don't relate to time taken in reality. You can do a simple benchmark and check. Even with O(1), one algorithm may contains only 10 assignments but the other has 50, still constant time complexity but much different time cost. Another example is quicksort, which has the same complexity as mergesort, but in general often work faster than merge sort in case the data is already in memory. Complexity just indicates the level of complication, not the time needed. – phuclv Jul 02 '15 at 06:12
[Why does the order of the loops affect performance when iterating over a 2D array?](http://stackoverflow.com/q/9936132/995714) [Why is there a significant difference in this C++ for loop's execution time?](http://stackoverflow.com/q/26583536/995714) This is used in [loop interchange optimization](https://en.wikipedia.org/wiki/Loop_interchange) which many compilers will automatically apply, so after optimization you might not see the difference although I don't think it's the case with Java JITers – phuclv Jul 02 '15 at 06:18

Does the order of nested loops matter for speed

2 Answers2