Time complexity is not the same as speed. For a given size of data, a program with `O(N)` might be slower, faster or the same speed as one with `O(2N)`. Also, for a given size of data, `O(N)` might be slower, faster or the same speed as `O(N^2)`.
So if Big-O doesn't mean anything, why are we talking about it anyway?
Big-O notation describes the behaviour of a program as the size of the data increases. This behaviour is always relative to the program's own run time at other sizes, not to other programs. In other words, Big-O tells you the shape of the asymptotic curve, but not its scale.
Let's say you have a program A that is `O(N)`. This means that processing time is linearly proportional to data size (ignoring real-world complications like cache sizes that might make the run time more like piecewise-linear):
- for 1000 rows it will take 3 seconds
- for 2000 rows it will take 6 seconds
- for 3000 rows it will take 9 seconds
And for another program B, which is also `O(N)`:
- for 1000 rows it will take 1 second
- for 2000 rows it will take 2 seconds
- for 3000 rows it will take 3 seconds
Obviously, the second program is 3 times faster per row, even though they are both `O(N)`. Intuitively, this tells you that both programs go through every row and spend some fixed amount of time processing it. The difference in time from 1000 to 2000 rows is the same as the difference from 2000 to 3000 rows; this means the time grows linearly, in other words the time needed for one record does not depend on the total number of records. This is equivalent to the program doing some kind of `for` loop, as for example when calculating a sum of numbers.
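For illustration, a minimal sketch of that kind of linear loop (the function and names here are made up, not taken from any real program A or B):

```python
# A hypothetical O(N) program: one fixed chunk of work per row,
# so twice the rows means twice the time.
def total(rows):
    s = 0
    for value in rows:   # visits every row exactly once
        s += value
    return s
```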
And since the programs are different and do different things, it doesn't make any sense to compare 1 second of program A's time to 1 second of program B's time anyway; you would be comparing apples and oranges. That's why we don't care about the constant factor, and we say that `O(3N)` is equivalent to `O(N)`.
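To make the constant-factor point concrete, here is a hedged sketch of two functions that compute the same thing; one does roughly 3N operations, the other roughly N, and both are `O(N)` (the names and the min/max/sum task are just an example):

```python
# Two hypothetical ways to get min, max and sum of the rows.
# The first makes roughly 3N operations, the second roughly N,
# yet both are O(N): Big-O deliberately hides that factor of 3.
def three_passes(rows):
    return min(rows), max(rows), sum(rows)      # three separate passes

def one_pass(rows):
    lo = hi = rows[0]                           # assumes rows is non-empty
    total = 0
    for v in rows:                              # a single pass
        lo = min(lo, v)
        hi = max(hi, v)
        total += v
    return lo, hi, total
```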
Now imagine a third program C, which is `O(N^2)`.
- for 1000 rows it will take 1 second
- for 2000 rows it will take 4 seconds
- for 3000 rows it will take 9 seconds
The difference in time here between 3000 and 2000 rows is bigger than the difference between 2000 and 1000. The more data there is, the bigger the increase. This is equivalent to a program doing a `for` loop inside a `for` loop, as for example when searching for pairs in the data.
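A sketch of that nested-loop shape, using a made-up pairs-search as the example task:

```python
# A hypothetical O(N^2) task: find all pairs of rows with equal values.
# For every row we look at every later row, so 2x the rows means ~4x the work.
def equal_pairs(rows):
    pairs = []
    for i in range(len(rows)):
        for j in range(i + 1, len(rows)):       # a loop inside a loop
            if rows[i] == rows[j]:
                pairs.append((i, j))
    return pairs
```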
When your data is small, you might not care about a 1-2 second difference. If you compare programs A and C just from the above timings, without understanding the underlying behaviour, you might be tempted to say that C is faster. But look what happens with more records:
- for 10000 rows program A will take 30 seconds
- for 10000 rows program C will take 100 seconds
- for 20000 rows program A will take 60 seconds
- for 20000 rows program C will take 400 seconds
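If you want to check these figures yourself, here is a tiny sketch using the constants implied by the timings above (3 ms per row for A, about 1 microsecond per pair of rows for C); the numbers are purely illustrative:

```python
# The constants implied by the example timings above (purely illustrative):
# program A takes 3 ms per row, program C takes 1 second per 1000x1000 rows.
def time_a(n):
    return 0.003 * n              # O(N), seconds

def time_c(n):
    return (n / 1000.0) ** 2      # O(N^2), seconds

for n in (1000, 3000, 10000, 20000, 100000):
    print(f"{n:>6} rows:  A = {time_a(n):8.0f} s   C = {time_c(n):8.0f} s")
```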
What starts out as comparable performance on small data quickly turns into a painfully obvious gap: at 20000 rows C is already more than 6 times slower than A, and that factor itself keeps growing with the data. There is no way that running C on a faster CPU could ever keep up with A, and the bigger the data, the more true this becomes.

The thing that makes all the difference is scalability. This means answering questions like how big a machine we are going to need in a year's time, when the database has grown to twice its size. With `O(N)` you are generally OK - you can buy more servers, more memory, use replication etc. With `O(N^2)` you are generally OK up to a certain size, at which point buying any number of new machines will not be enough to solve your problems any more, and you will need to find a different approach in software, or run it on massively parallel hardware such as GPU clusters. With `O(2^N)` you are pretty much fucked unless you can somehow limit the maximum size of the data to something that is still usable.
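To give `O(2^N)` a face, here is a sketch of the kind of brute force that gets you there; subset-sum by trying every subset is just one stand-in example:

```python
from itertools import combinations

# A classic way to end up with O(2^N): brute-force subset-sum, which in the
# worst case tries every one of the 2^N subsets of the input. Each extra
# element doubles the work, so no hardware upgrade keeps up for long.
def subset_with_sum(values, target):
    for size in range(len(values) + 1):
        for combo in combinations(values, size):    # 2^N subsets in total
            if sum(combo) == target:
                return combo
    return None
```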
Note that the above examples are theoretical and intentionally simplified; as @PeterCordes pointed out, the times on a real CPU might be different because of caching, branch misprediction, data alignment issues, vector operations and a million other implementation-specific details. Please see his links in the comments below.