Assume I have two algorithms:

for (int i = 0; i < n; i++) {
  for (int j = 0; j < n; j++) {
    //do something in constant time
  }
}

This is naturally O(n^2). Suppose I also have:

for (int i = 0; i < 100; i++) {
  for (int j = 0; j < n; j++) {
    //do something in constant time
  }
}

This is O(n) + O(n) + ... + O(n) (100 terms) = O(100n) = O(n)

It seems that even though my second algorithm is O(n), it will take longer (at least while n < 100). Can someone expand on this? I bring it up because I often see algorithms where, for example, they perform a sorting step first, and when determining total complexity, it's just the most complex element that bounds the algorithm.

Brian
  • An algorithm that belongs to O(n) can take up to `c*n` steps to finish, where `c` is some constant. So your algorithm is still O(n) for any number of constant iterations you make in the first loop. – ChronoTrigger Mar 23 '14 at 17:18
  • Your understanding is correct: Big-O analysis isn't used to determine absolute resource usage, but to compare how the resource usage grows with the input size. – amon Mar 23 '14 at 17:18
  • Why do you think that the second algorithm will take longer? As soon as `n` > 100 then the first algorithm will take longer. – aquinas Mar 23 '14 at 17:20
  • See also: [galactic algorithms](http://rjlipton.wordpress.com/2010/10/23/galactic-algorithms/) – harold Mar 23 '14 at 17:22
  • All very good answers everyone. I knew it wasn't supposed to explicitly cover runtimes, but these answers help put this dilemma in perspective. @aquinas You are right. I thought of this after posting lol. – Brian Mar 23 '14 at 17:23
  • Big O also tends to talk about the limit as n goes to infinity, or more realistically, just very large n. In the example you posted, if n is a billion, then the first example will take much longer than the second. We tend not to really care about big-O complexity if n is very small, for exactly the reason you listed: if n is less than, let's say, 10, then other factors will almost always dominate the running time of the algorithm, not the n^2 part. – Elliott Mar 27 '14 at 20:37

6 Answers

101

Asymptotic complexity (which is what both big-O and big-Theta represent) completely ignores the constant factors involved - it's only intended to give an indication of how running time will change as the size of the input gets larger.

So it's certainly possible that a Θ(n) algorithm can take longer than a Θ(n²) one for some given n - exactly which values of n this happens for depends on the algorithms involved. For your specific example, it is the case for n < 100, ignoring the possibility of optimizations differing between the two.

For any two given algorithms taking Θ(n) and Θ(n²) time respectively, what you're likely to see is that either:

  • The Θ(n) algorithm is slower when n is small, then the Θ(n²) one becomes slower as n increases
    (which happens if the Θ(n) one is more complex, i.e. has higher constant factors), or
  • The Θ(n²) one is always slower.

It's also possible for the two to alternate - the Θ(n) algorithm slower, then the Θ(n²) one, then the Θ(n) one again, and so on as n increases - until n gets sufficiently large, from which point onwards the Θ(n²) one is always slower. This is, however, very unlikely to happen in practice.

In slightly more mathematical terms:

Let's say the Θ(n²) algorithm takes cn² operations for some c.

And the Θ(n) algorithm takes dn operations for some d.

This is in line with the formal definition, since we can assume these bounds hold for all n greater than 0, and that the two functions between which the running time lies are one and the same.

In line with your example, if you were to say c = 1 and d = 100, then the Θ(n) algorithm would be slower until n = 100, beyond which the Θ(n²) algorithm would become slower.

[Graph of 100n and n² plotted together, crossing at n = 100 (courtesy of WolframAlpha).]
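
To see the crossover concretely, here's a minimal benchmark sketch of the two loops from the question (Java is an assumption on my part, since the question's snippets are C-style; the raw `System.nanoTime` readings are illustrative only - a serious measurement would need warm-up and a harness such as JMH):

public class CrossoverDemo {

    // The Θ(n²) algorithm: n outer iterations of an n-step inner loop.
    static long quadratic(int n) {
        long work = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                work++;                 // stand-in for the constant-time body
        return work;
    }

    // The Θ(n) algorithm: a fixed 100 outer iterations, so 100n steps in total.
    static long linear(int n) {
        long work = 0;
        for (int i = 0; i < 100; i++)
            for (int j = 0; j < n; j++)
                work++;
        return work;
    }

    public static void main(String[] args) {
        for (int n : new int[]{10, 100, 1000, 10000}) {
            long t0 = System.nanoTime();
            long a = quadratic(n);
            long t1 = System.nanoTime();
            long b = linear(n);
            long t2 = System.nanoTime();
            // Printing the results also keeps the JIT from discarding the loops.
            System.out.printf("n=%5d  quadratic: %9d ns (%d ops)  linear: %9d ns (%d ops)%n",
                    n, t1 - t0, a, t2 - t1, b);
        }
    }
}

The operation counts alone already tell the story: below n = 100 the quadratic loop does less work, above it the linear one does.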

Notation note:

Technically big-O is only an upper bound, meaning you can say that an O(1) algorithm (or really any algorithm taking O(n²) or less time) takes O(n²) time as well. Thus I instead used big-Theta (Θ) notation, which is a tight bound. See the formal definitions for more information.

Big-O is often informally treated as or taught to be a tight bound, so you may already have been essentially using big-Theta without knowing it.

If we're talking about an upper bound only (as per the formal definition of big-O), it's rather an "anything goes" situation: the O(n) algorithm can be faster, the O(n²) one can be faster, or they can take the same amount of time (asymptotically). One usually can't draw particularly meaningful conclusions from comparing the big-O of two algorithms; one can only say that, given the big-O of some algorithm, that algorithm won't take longer than that amount of time (asymptotically).

Bernhard Barker
  • As a concrete example, Insertion Sort (O(n^2)) is faster than Quicksort (O(n log n)) for small inputs, and [is often used as a "base case" for small inputs](http://en.wikipedia.org/wiki/Quicksort#Optimizations). Note also that insertion sort is an interesting example of where naive asymptotic analysis can fail. It runs in O(n^2) in the worst case, but in O(n) time for "mostly" sorted lists. This is because it actually runs in O(# of inversions). – Alexis Beingessner Mar 24 '14 at 02:31
  • @Raphael: In programming, [`O(n)` is often used as Θ(n)](http://en.wikipedia.org/wiki/Big_O_notation#Use_in_computer_science) – jfs Mar 24 '14 at 13:33
  • @J.F.Sebastian Then it's used in the wrong way, creating unnecessary barriers between scientists and practitioners. Just use Θ as it's usually what you (i.e. programmers) want (you *should* want more precise statements, but that's for another time). (The Wikipedia article suggests that this practice is wide-spread in CS. While unfortunately true, that does not make it correct: typically people who make this error are a) not aware of the facts (bad) or b) sloppy (worse).) – Raphael Mar 24 '14 at 14:00
  • @Raphael Part of the problem is that `Θ` isn't on the keyboard, and (as demonstrated in this question and answer) the entire notation isn't as practical as it could be, due to ignoring constants. Using `O` is just being flexible when you need a rough guideline. – Izkata Mar 24 '14 at 15:26
  • @Izkata The keyboard issue is a weak excuse. It's also why sites that do care about precision use MathJax. ;) (This question should probably have been migrated to [cs.SE]). Using a "rough guideline" is fair, but it's usually a bad idea to use a symbol that has a precise meaning in a closely related discipline for basically the same thing but with a slightly different meaning. Because then, programmers (by "socialisation") start reading CS material and things become ugly. So at least folks writing documentation or SO answers on these topics should take the time to copy-paste a Θ from somewhere. – Raphael Mar 24 '14 at 16:24
  • +1 for the graph at the end. If moved up, it would serve as a TLDR summary IMO. – Dan Is Fiddling By Firelight Mar 24 '14 at 17:34
  • @DanNeely But then people wouldn't be forced to read the theory that is so essential for understanding Big-O :P – Niklas B. Mar 24 '14 at 20:54
  • @Raphael It's really not particularly important, and you're dealing with semantics. A programmer wants to be able to say "my algorithm is O(whatever)." To him this means: my algorithm can be no slower than this, which is what's important. Especially since requirements may often be in O format, simply because they say "We need at LEAST this, so prove that it's O(f(n)) and we're happy". Proving theta is often superfluous. – Cruncher Mar 24 '14 at 20:56
  • @Cruncher Agreed. No point in giving a looser upper bound than you can prove, so typically O(f(n)) is already a tight bound – Niklas B. Mar 24 '14 at 20:57
  • @Cruncher If O is only used in the way you state, it's fine. Trouble is, it's *also* used to express lower bounds, see e.g. the original form of this answer (and many other examples). As soon as you start comparing algorithms and say stuff like "This algorithm is O(n²) and therefore worse than this O(n) algorithm", you not only fail to make a valid point, you might also miss the inverse fact along the way! (As a concrete example, consider "Mergesort is O(n³) and therefore worse than Bubblesort which is in O(n²)". And please don't "nobody'll say that" me -- I have graded algorithms exams.) – Raphael Mar 24 '14 at 21:03
  • "Although it's certainly possible that the Θ(n) algorithm can be slower, then the Θ(n²) one, then the Θ(n) one again, and so on as n increases, until n gets very large, from which point onwards the Θ(n²) one will always be slower, although it's greatly unlikely to happen." I'd like an example for that. Given that running times are strictly positive and generally monotonously increasing with n, this should not happen. The only thing that could make that happen is if an algorithm's complexity does not increase monotonously. – G. Bach Mar 24 '14 at 21:33
  • @G.Bach I doubt there's a well-known / useful algorithm exhibiting such behaviour, but it's trivial to construct one - let's say the `Θ(n²)` does ... well, nothing when `n < 1000000` and `n % 1000 < 500`, and insertion-sort otherwise. This should alternate between faster and slower than a simple `Θ(n)` linear search. – Bernhard Barker Mar 25 '14 at 01:20
  • I suppose I misphrased my question, I was looking for some natural algorithm that exhibits such behaviour, not a constructed one. But oh well. – G. Bach Mar 25 '14 at 01:37
  • @G.Bach: It occurs to me that there's many (unoptimized especially) algorithms which are very fast for powers-of-two, and very slow for powers-of-two-minus-one. Those would definitely not be strictly increasing. I wrote a sort a while back that I later discovered would quickly sort an even number of elements, and if there was one left (ie if n was odd), it would then effectively insert-sort that last element. I didn't notice for a while because it was accurate for all n, but I only tested performance for _even_ values of n. – Mooing Duck Jun 03 '14 at 00:08
19

Yes, an O(n) algorithm can exceed an O(n²) algorithm in terms of running time. This happens when the constant factor (which we omit in the big-O notation) is large. For example, in your code above, the O(n) algorithm has a large constant factor, so it will perform worse than the algorithm that runs in O(n²) for n < 100.

Here, n = 100 is the cross-over point. So when a task can be performed both in O(n) and in O(n²) time, and the constant factor of the linear algorithm is larger than that of the quadratic one, we tend to prefer the asymptotically worse algorithm for small inputs. For example, when sorting an array, we switch to insertion sort for smaller arrays, even though merge sort and quick sort run asymptotically faster. Insertion sort has a smaller constant factor than merge/quick sort and runs faster on small inputs.
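
As an illustration of that cutoff idea, here's a minimal sketch of such a hybrid sort (the class name, method names, and the threshold of 16 are my own illustrative choices, not taken from any particular library):

public class HybridSort {

    static final int CUTOFF = 16;  // illustrative threshold; real libraries tune this empirically

    // Sorts a[lo..hi): merge sort that falls back to insertion sort on small ranges.
    static void sort(int[] a, int lo, int hi) {
        if (hi - lo <= CUTOFF) {
            insertionSort(a, lo, hi);
            return;
        }
        int mid = lo + (hi - lo) / 2;
        sort(a, lo, mid);
        sort(a, mid, hi);
        merge(a, lo, mid, hi);
    }

    // O(n²) worst case, but with a tiny constant factor - fast for small ranges.
    static void insertionSort(int[] a, int lo, int hi) {
        for (int i = lo + 1; i < hi; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= lo && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }

    // Standard merge of the two sorted halves a[lo..mid) and a[mid..hi).
    static void merge(int[] a, int lo, int mid, int hi) {
        int[] tmp = new int[hi - lo];
        int i = lo, j = mid, k = 0;
        while (i < mid && j < hi) tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
        while (i < mid) tmp[k++] = a[i++];
        while (j < hi)  tmp[k++] = a[j++];
        System.arraycopy(tmp, 0, a, lo, tmp.length);
    }

    public static void main(String[] args) {
        int[] a = {5, 3, 8, 1, 9, 2, 7, 4, 6, 0};
        sort(a, 0, a.length);
        System.out.println(java.util.Arrays.toString(a));
    }
}

Real implementations tune the cutoff empirically, since the best value depends on the platform and the cost of comparisons.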

Nikunj Banka
  • Thanks Nikunj. Also a great answer. I just accepted the one that came in first. – Brian Mar 23 '14 at 17:31
  • It's fine. My first answer on stackoverflow that I wrote within a few minutes of the question being posted. Guess I need to improve my typing speed. – Nikunj Banka Mar 23 '14 at 17:33
  • Good answer, but shouldn't n = 100 be the crossover point in the example provided by the OP? – Derek W Mar 23 '14 at 18:03
  • 100^2 = 10000 while 10*100 = 1000. So when n=100 then the linear algorithm is faster. You can also verify this from the graph in @Dukeling 's answer in which the two graphs cross at n=10 and hence 10 is the crossover point. – Nikunj Banka Mar 23 '14 at 18:05
  • Big-O only gives a rough idea, and determines the asymptotic behaviour. If the difference is smaller, say O(n) vs. O(n log n), the constants are much more important - one program taking 100n nanoseconds will in practice never be faster than one taking (n ln n) nanoseconds (if you calculate the crossover point, it will be some million years in the future). – gnasher729 Mar 24 '14 at 01:15
  • @NikunjBanka The argument in your comment above is not entirely correct; I think you forgot the constant factors. O(n) means "for some c, it takes time c*n"; O(n*n) means "for some d, it takes time d*n*n". But if all we have is "O(n)" and "O(n*n)", we don't know what c and d are, and we can't solve the equation "c*n < d*n*n" for n. See accepted answer. Your comment seems to assume c=d=1, but OPs very nice example has constant factors of 1 (O(n*n) algorithm) and 100 (O(n) algorithm) respectively. – Søren Debois Mar 24 '14 at 06:25
  • @SørenDebois I have not forgotten the notations. The question has been edited. Earlier the OP used 10 iterations and now it has been changed to 100. The accepted answer has also been edited accordingly as you can see just 1 hour ago. I will make the necessary changes. – Nikunj Banka Mar 24 '14 at 09:33
  • @NikunjBanka I see. Sorry about that! I amended my own answer. – Søren Debois Mar 24 '14 at 13:47
17

Big-O notation is not meant to compare the relative speed of different algorithms. It is meant to measure how fast the running time increases when the size of the input increases. For example,

  • O(n) means that if n is multiplied by 1000, then the running time is roughly multiplied by 1000.
  • O(n^2) means that if n is multiplied by 1000, then the running time is roughly multiplied by 1000000.

So when n is large enough, any O(n) algorithm will beat any O(n^2) algorithm. It doesn't tell you anything for a fixed n.
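
A practical way to observe this growth rate is a doubling experiment: time the algorithm at n and at 2n and look at the ratio - roughly 2 suggests O(n), roughly 4 suggests O(n^2). A rough sketch (single wall-clock timings are noisy, so treat the output as indicative only):

public class DoublingDemo {

    // Stand-in for the algorithm under test; this one does n² constant-time steps.
    static long algorithm(int n) {
        long sink = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                sink++;
        return sink;
    }

    public static void main(String[] args) {
        long prev = 0;
        for (int n = 1000; n <= 16000; n *= 2) {
            long t0 = System.nanoTime();
            long result = algorithm(n);
            long t = System.nanoTime() - t0;
            // For a quadratic algorithm the ratio should settle near 4 per doubling.
            System.out.printf("n=%6d  time=%11d ns  ratio=%s  (result=%d)%n",
                    n, t, prev == 0 ? "-" : String.format("%.1f", (double) t / prev), result);
            prev = t;
        }
    }
}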

hivert
5

Long story short, yes, it can. The definition of O is based on the fact that if O(f(x)) < O(g(x)), then an algorithm running in g(x) time will definitively take longer to run than one running in f(x) time, given a big enough x.

For example, it is a known fact that for small values merge sort is outperformed by insertion sort (if I remember correctly, that should hold true for n smaller than 31).

Save
3

Yes. The O() notation describes only asymptotic complexity. The linear algorithm can be slower than the quadratic one if it has a large enough constant factor (e.g. if the core of its loop runs 10 times longer, it will be slower than the quadratic version for small enough n).

The O() notation is only an extrapolation, albeit quite a good one.

peterh
  • Something that hasn't been discussed yet is that a given algorithm might be blazingly fast once it gets going, with a small constant multiplier, **but** it has a huge setup time (constant, regardless of _n_). For very large _n_'s, the setup time becomes insignificant, but for small _n_'s, it could mean running more slowly than an algorithm with a "slower" O. – Phil Perry Mar 24 '14 at 18:45
3

The only guarantee you get is that - no matter the constant factors - for big enough n, the O(n) algorithm will perform fewer operations than the O(n^2) one.

As an example, let's count the operations in the OP's neat example. The two algorithms differ in only one line:

for (int i = 0; i < n; i++) {       (* A, the O(n*n) algorithm. *)

vs.

for (int i = 0; i < 100; i++) {     (* B, the O(n) algorithm. *)

Since the rest of the two programs is the same, the difference in actual running times will be decided by these two lines.

  • For n=100, both lines do 100 iterations, so A and B perform exactly the same at n=100.
  • For n<100, say, n=10, A does only 10 iterations, whereas B does 100. Clearly A is faster.
  • For n>100, say, n=1000. Now the loop of A does 1000 iterations, whereas, the B loop still does its fixed 100 iterations. Clearly A is slower.

Of course, how big n has to get for the O(n) algorithm to be faster depends on the constant factor. If you change the constant 100 to 1000 in B, then the cutoff also changes to 1000.
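
The same comparison can be made exact by computing the iteration counts directly instead of timing anything; a quick sketch:

public class IterationCounts {
    public static void main(String[] args) {
        for (int n : new int[]{10, 100, 1000}) {
            long opsA = (long) n * n;  // A: n outer iterations, each running an n-step inner loop
            long opsB = 100L * n;      // B: a fixed 100 outer iterations of the same inner loop
            System.out.printf("n=%5d  A: %8d steps  B: %8d steps  -> %s%n",
                    n, opsA, opsB,
                    opsA < opsB ? "A faster" : opsA > opsB ? "B faster" : "tie");
        }
    }
}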

Søren Debois