
How can I learn if an algorithm runs in O(n) time, or O(n log n)?

Algorithmist

3 Answers


Well here goes my first answer...

O() notation for an algorithm (though it can apply to any function) is, roughly, an upper bound on the algorithm's growth rate. The case you are probably concerned with is the function relating the number of operations your algorithm performs to a given input size. You seem to be asking for a step-by-step approach, so I'll attempt one:

  1. Write a function that describes the worst case number of operations performed by your algorithm for an input size of n (or an existing data structure of n elements, etc).

  2. Simplify that function using the definition of Big-O notation. Most of the time this is pretty straightforward; however, some cases require understanding the mathematical formalism and what Big-O really means. I won't go into that here, as it's a better job for a textbook or your CS professor to explain. Ignoring the details: keep the term (if the function is a sum of terms) that grows the fastest and get rid of everything else; if that term has a constant coefficient, drop it as well. For example, f(n) = 4n^2 + 10000n + 500 simplifies to n^2: the 4n^2 term grows faster than the other two terms, and the coefficient 4 is dropped, leaving n^2.
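To see why the lower-order terms stop mattering, here's a quick sketch (class and method names are mine, just for illustration) that evaluates f(n) = 4n^2 + 10000n + 500 and shows f(n)/n^2 settling toward the constant 4 as n grows:

```java
public class DominantTerm {
    // f(n) = 4n^2 + 10000n + 500, the example function above
    static double f(double n) {
        return 4 * n * n + 10000 * n + 500;
    }

    public static void main(String[] args) {
        // The ratio f(n)/n^2 approaches 4, so n^2 alone dictates the growth rate
        for (long n : new long[] {10, 1_000, 100_000, 10_000_000}) {
            System.out.printf("n=%-10d f(n)/n^2 = %.4f%n", n, f(n) / ((double) n * n));
        }
    }
}
```

At n = 10 the 10000n term still dominates, but by n = 10,000,000 the ratio is already 4.001; that is what dropping everything but n^2 is relying on.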

Here's an example. Let's say List is an array of numbers and lSize is the number of items in the array.

for (int i = 0; i < lSize; ++i) {
  for (int j = i + 1; j < lSize; ++j) { // note: j < lSize, or List[j] would read past the end
    if (List[i] > List[j]) { //swap the elements
      int temp = List[i];
      List[i] = List[j];
      List[j] = temp;
    }
  }
}

Let's count the number of operations required to sort a list of n items using the selection-sort-style algorithm shown above.

Step 1

For a list of size n, how many operations does this algorithm require to sort (in terms of n)? Here lSize takes the value n. We have two for loops, one nested inside the other. The outer loop is easy: it executes once for every element in the list, or n times. Inside that loop is another for loop whose number of iterations isn't immediately obvious: it depends on the current value of i in the outer loop. If you sit down with some test values and write out the series that emerges, you'll see that the inner loop first runs (n-1) times, then (n-2) times, and so on down to 1; the total count is (n-1) + (n-2) + ... + 2 + 1. This is the sum of all the numbers from 1 to n-1. If n = 100, the total number of inner-loop iterations is 99 + 98 + 97 + ... + 2 + 1. This is an arithmetic series and simplifies, for n items, to n(n-1)/2. That is how many times the body of the inner loop gets evaluated. In the worst case all 3 swap operations inside the inner loop execute on every iteration, so the total number of operations for n items is 3 * n(n-1)/2. Rewritten in a form easier for the next part: (3/2)n^2 - (3/2)n.
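You can check the arithmetic-series count empirically. This sketch (names are mine) runs the same nested loop bounds, with j going from i+1 up to but not including n, and just counts inner-loop iterations:

```java
public class InnerLoopCount {
    // Same loop bounds as the sort above, but only counting inner iterations
    static long countInner(int n) {
        long count = 0;
        for (int i = 0; i < n; ++i) {
            for (int j = i + 1; j < n; ++j) {
                ++count;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 100, 1000}) {
            long formula = (long) n * (n - 1) / 2; // the arithmetic-series result
            System.out.println("n=" + n + "  counted=" + countInner(n) + "  formula=" + formula);
        }
    }
}
```

For n = 100 both the count and the formula give 4950, matching n(n-1)/2.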

Step 2

We now have our function: if f(n) is the number of operations performed by our algorithm, then f(n) = (3/2)n^2 - (3/2)n. The fastest-growing term is (3/2)n^2, so we drop the (3/2)n. That leaves (3/2)n^2, from which we drop the (3/2) coefficient. This leaves n^2. Therefore, we can say f(n) is O(n^2).

This is basically the recipe for describing an algorithm in O() notation. It neglects the important whys and gives a brainless recipe for producing an answer. Why summarize algorithms this way? Why is it OK to just forget about a term or a coefficient? The answers to those questions are available in a more rigorous textbook treatment.

I hope I didn't make any terrible errors, it's a bit late.


Complexity depends on what your input is and which variables your algorithm uses to compute the output.

A few simple examples to help you understand basic algorithm complexity:

  • The following loops are all O(n):

    // This is O(n)
    for (int i = 0; i < array.length; i++) {
      sum += array[i];
    }
    
    // This is O(n)
    for (int i = 0; i < array.length / 2; i++) {
      sum += array[i];
    }
    
    // This is O(n)
    for (int i = 0; i < array.length / 200; i++) {
      sum += array[i];
    }
    

    because the number of iterations always depends on the size of the array (which is n here).

  • But this is O(1):

    // This is O(1)
    array[i]; // where i is an int
    
    // This is O(1)
    for (int i = 0; i < 10000; i++) {
      sum += array[i % array.length];
    }
    

    whatever the size of the array: 1, 10,000, or 10 billion elements...

  • This is O(n^2):

    // An O(n) loop containing an O(n) operation => O(n^2)
    for (int i = 0; i < matrix.length; i++) {      
      for (int j = 0; j < matrix[i].length; j++) { // O(n)
        sum += matrix[i][j];
      }
    }
    

    assuming this is a square matrix (otherwise the inner loop runs in O(m), giving O(n*m) overall).

There are lots of famous algorithms worth studying to learn how to work out complexity. For an O(log n) example, take a look at: http://en.wikipedia.org/wiki/Binary_search_algorithm
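To make the O(log n) case concrete, here is a sketch of the classic iterative binary search (names are mine): each step halves the remaining range, so an array of n elements needs at most about log2(n) steps.

```java
public class BinarySearchDemo {
    // Returns the index of target in the sorted array, or -1 if absent.
    static int binarySearch(int[] sorted, int target) {
        int lo = 0, hi = sorted.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2; // avoids int overflow for large indices
            if (sorted[mid] == target) return mid;
            if (sorted[mid] < target) lo = mid + 1; // discard the lower half
            else hi = mid - 1;                      // discard the upper half
        }
        return -1; // not found
    }

    public static void main(String[] args) {
        int[] a = {2, 3, 5, 7, 11, 13};
        System.out.println(binarySearch(a, 7));  // 3
        System.out.println(binarySearch(a, 4));  // -1
    }
}
```

Doubling the array size adds only one extra halving step, which is exactly what O(log n) means in practice.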

When it comes to evaluating real-world algorithms, the complexity of some operations may be hidden in the representation you use to store your values. For example, in Java you have ArrayList and LinkedList, both of which can be used through the List interface. So the following line:

l.get(i); // where i is an int and l is a List

may take O(1) time or O(n) time depending on the actual structure of the list.
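A small sketch of that difference (class and method names are mine): the same indexed-sum loop is O(n) over an ArrayList but O(n^2) over a LinkedList, because each LinkedList.get(i) walks the nodes.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class ListGetDemo {
    // The cost of each l.get(i) depends on what is behind the List interface:
    // O(1) for ArrayList (array indexing), O(n) for LinkedList (node walking).
    static long sumByIndex(List<Integer> l) {
        long sum = 0;
        for (int i = 0; i < l.size(); i++) {
            sum += l.get(i); // O(1) or O(n) per call, so the loop is O(n) or O(n^2)
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> al = new ArrayList<>();
        List<Integer> ll = new LinkedList<>();
        for (int i = 0; i < 20_000; i++) { al.add(i); ll.add(i); }

        long t0 = System.nanoTime();
        sumByIndex(al);
        long t1 = System.nanoTime();
        sumByIndex(ll);
        long t2 = System.nanoTime();
        System.out.println("ArrayList:  " + (t1 - t0) / 1_000_000.0 + " ms");
        System.out.println("LinkedList: " + (t2 - t1) / 1_000_000.0 + " ms");
    }
}
```

(Exact timings vary by machine and JIT warm-up, but the LinkedList pass is dramatically slower at this size.)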

You can read these pages for more information: http://en.wikipedia.org/wiki/Analysis_of_algorithms and http://en.wikipedia.org/wiki/Computational_complexity_theory

fabien
    You missed `O(nlogn)` – P0W Aug 25 '13 at 05:52
  • I would say your second example is `O(N)` as well, as the innermost statement is executed `O(N)` times. `N` here is of course `min(array.length, 10000).` – user207421 Aug 25 '13 at 05:56
  • Second example is often confusing. But if you prefer, you can fix the number of hard-coded iterations at 2 and the size of the array at 10 billion. Or do the opposite: an array of size 2 and 10 billion iterations. In all cases, the number of operations the loop does is independent of N (the array's length), hence O(1). You can always argue that 10,000 can also equal N sometimes or be close enough, but since it does not depend on N, the complexity of this "algorithm" does not depend on N either. If you prefer: the 'exact' complexity of this loop is O(10000) = O(1). – fabien Aug 25 '13 at 06:00
  • The number of operations the loop depends on the value of `array.length` for N < 10000. – user207421 Aug 25 '13 at 06:02
  • i see your point now, fixed! – fabien Aug 25 '13 at 06:06
  • No, it isn't fixed at all. There are still different numbers of operations. The indexing only occurs for `i < array.length`. Try something that doesn't depend on `array.length` at all. You're trying to illustrate `O(1)` here, not construct a tricky corner case. – user207421 Aug 25 '13 at 06:10
  • array[i] takes O(1) time as well as doing the addition. Since O(1) + O(1) = O(1), the example was valid. I changed it again though so it always does exactly the same operation in the loop. – fabien Aug 25 '13 at 06:27
  • `array[i]` is evaluated `array.length` times and thereafter not at all under the condition we are discussing. It wasn't `O(1).` – user207421 Aug 25 '13 at 07:18

In the first instance, you can learn this kind of thing by:

  • reading a textbook (or research paper, or some other resource) that describes the algorithm and analyses its complexity, or

  • analysing the algorithm yourself; i.e. doing the mathematics.

When you get enough experience, you can pick up some intuitive short-cuts for estimating the complexity of an algorithm.


One technique that some people suggest is to measure the performance over a range of values of the problem-size variable (or variables), graph the results, and attempt to fit a curve. However, there are problems with this:

  • It is often difficult to know what the variables actually are.
  • Often the relationship between the variables is sufficiently complicated that reliable curve fitting is impossible.
  • It is often difficult to get valid (i.e. representative) measurements.
  • Some algorithms have different complexity measures for best case, average case and/or worst case behaviour.

In short, this approach often doesn't give reliable answers.
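For what it's worth, here is a minimal sketch of that measurement approach (names are mine): time the same function at n and 2n and look at the ratio, where roughly 2 suggests O(n) and roughly 4 suggests O(n^2). The caveats above apply in full: JIT warm-up, caching, and timer noise easily distort a single run.

```java
public class DoublingExperiment {
    // Work whose cost we pretend not to know: sum 0..n-1, which is O(n).
    static long work(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) sum += i;
        return sum;
    }

    static long timeNanos(int n) {
        long start = System.nanoTime();
        work(n);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 5_000_000;
        // A ratio near 2 hints at O(n); near 4 would hint at O(n^2).
        // In practice you would warm up the JIT and average many runs.
        double ratio = (double) timeNanos(2 * n) / timeNanos(n);
        System.out.printf("T(2n)/T(n) = %.2f%n", ratio);
    }
}
```

Even for this trivially O(n) loop, the printed ratio can land far from 2 on a single run, which illustrates the unreliability described above.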

Stephen C