
The question is rather simple, but I just can't find a good enough answer. The most upvoted SO question regarding big-O notation says:

For example, sorting algorithms are typically compared based on comparison operations (comparing two nodes to determine their relative ordering).

Now let's consider the simple bubble sort algorithm:

for (int i = arr.length - 1; i > 0; i--) {
    for (int j = 0; j < i; j++) {
        if (arr[j] > arr[j+1]) {
            switchPlaces(...); // swap arr[j] and arr[j+1]
        }
    }
}

I know that the worst case is O(n²) and the best case is O(n), but what is n exactly? If we attempt to sort an already sorted array (best case), we would end up doing nothing, so why is it still O(n)? We are still looping through two for-loops, so if anything it should be O(n²). n can't be the number of comparison operations, because we still compare all the elements, right?

Mumfi
  • n is the length of the array to be sorted. Even if the array is sorted, you need n - 1 comparisons to verify that. – Niklas B. May 31 '14 at 21:17
  • What are "WC" and "BC" supposed to be? Please avoid using uncommon abbreviations, you can't be sure that people will understand what they expand to. –  May 31 '14 at 21:19
  • Assuming WC and BC mean worst case and best case, the algorithm you've shown doesn't have best-case O(n), because it doesn't exit early when it finds the array is sorted. – user2357112 May 31 '14 at 21:20
  • @NiklasB. n is the length of the array? How can that be? The length of the array doesn't change for the worst case of O(n^2)? – Mumfi May 31 '14 at 21:28
  • @eitank that is incorrect. When dealing with sorting algorithms, N represents the total number of elements to sort. The number of times you access memory would be a function `f(n)`, i.e. a function of N. –  May 31 '14 at 21:32
  • @Mumfi please look up the [formal definition of Big-O](http://en.wikipedia.org/wiki/Big_O_notation#Formal_definition). –  May 31 '14 at 21:51

4 Answers


When analyzing the Big-O performance of sorting algorithms, n typically represents the number of elements that you're sorting.

So, for example, if you're sorting n items with Bubble Sort, the runtime performance in the worst case will be on the order of O(n²) operations. This is why Bubble Sort is considered an extremely poor sorting algorithm: it doesn't scale well with increasing numbers of elements to sort. As the number of elements to sort increases linearly, the worst-case runtime increases quadratically.

Here is an example graph demonstrating how various algorithms scale in terms of worst-case runtime as the problem size N increases. The dark-blue line represents an algorithm that scales linearly, while the magenta/purple line represents a quadratic algorithm.

Notice that for sufficiently large N, the quadratic algorithm eventually takes longer than the linear algorithm to solve the problem.

[Graph: running times of various algorithms as problem size N increases]

Graph taken from http://science.slc.edu/~jmarshall/courses/2002/spring/cs50/BigO/.
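
To make the quadratic growth concrete, here is a small sketch (my own Java illustration, not part of the original answer) that counts how many comparisons the doubly nested bubble-sort loops perform, without sorting any real data:

public class BubbleScaling {
    // Count the comparisons performed by the doubly nested bubble-sort loops
    // (no early exit), without touching any actual data.
    static long countComparisons(int n) {
        long comparisons = 0;
        for (int i = n - 1; i > 0; i--) {
            for (int j = 0; j < i; j++) {
                comparisons++; // one comparison per inner-loop iteration
            }
        }
        return comparisons; // always n * (n - 1) / 2
    }

    public static void main(String[] args) {
        for (int n = 1000; n <= 8000; n *= 2) {
            System.out.println("n = " + n + ": " + countComparisons(n) + " comparisons");
        }
    }
}

Doubling n from 1000 to 2000 takes the count from 499,500 to 1,999,000, about four times as many, which is exactly the quadratic scaling the graph shows.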


  • So that's just for sorting algorithms? Is this just a common convention then? I mean, let's say you have two for-loops, one nested in the other, both going from 0 to n, and after the inner for-loop some addition is being done. That would be O(n^2). What is n then? Just the number of + operations? – Mumfi May 31 '14 at 21:26
  • When you're doing Big-O analysis in general, N represents the size of your problem. In your double-loop example, what is the number `n` supposed to represent? –  May 31 '14 at 21:28
  • Just some arbitrary number. Kinda like this: http://stackoverflow.com/questions/526728/time-complexity-of-nested-for-loop – Mumfi May 31 '14 at 21:35

I think two things are getting confused here: n, and the function of n that is being bounded by the Big-O analysis.

By convention, for any algorithm complexity analysis, n is the size of the input if nothing different is specified. For any given algorithm, there are several interesting functions of the input size for which one might calculate asymptotic bounds such as Big-O.

The commonest such function for a sorting algorithm is the worst case number of comparisons. If someone says a sorting algorithm is O(n^2), without specifying anything else, I would assume they mean the worst case comparison count is O(n^2), where n is the input size.

Another interesting function is the amount of working space, i.e. space in addition to the array being sorted. Bubble sort's working space is O(1), constant space, because it only uses a fixed number of variables regardless of the array size.

Bubble sort can be coded to do only n - 1 array element comparisons in the best case, by finishing after any pass that does no exchanges. See this pseudocode implementation, which uses a swapped flag to remember whether there were any exchanges. If the array is already sorted, the first pass does no exchanges, so the sort finishes after one pass.
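
As a concrete illustration (my own Java sketch, not the pseudocode implementation the answer refers to), an early-exit bubble sort instrumented to count comparisons performs exactly n - 1 of them on an already sorted array:

public class BestCaseBubble {
    // Early-exit bubble sort that counts element comparisons.
    static int sortAndCount(int[] arr) {
        int comparisons = 0;
        for (int i = arr.length - 1; i > 0; i--) {
            boolean swapped = false;
            for (int j = 0; j < i; j++) {
                comparisons++;
                if (arr[j] > arr[j + 1]) {
                    int tmp = arr[j]; arr[j] = arr[j + 1]; arr[j + 1] = tmp;
                    swapped = true;
                }
            }
            if (!swapped) break; // a pass with no exchanges: the sort is done
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 3, 4, 5, 6, 7, 8};
        System.out.println(sortAndCount(sorted)); // prints 7, i.e. n - 1
    }
}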

Patricia Shanahan
  • Good answer. However, I think the original poster might also be confused as to why the best case runtime of bubble sort (according to him) is O(n) for an already sorted array, when the algorithm uses two nested for-loops. Or something like that. The original question was a little confusing and unclear. –  May 31 '14 at 21:55

n is usually the size of the input. For an array, that would be the number of elements.

To see the different cases, you would need to change the algorithm:

for (int i = arr.length - 1; i > 0; i--) {
    boolean swapped = false; // track whether this pass exchanged anything

    for (int j = 0; j < i; j++) {
        if (arr[j] > arr[j+1]) {
            switchPlaces(...);
            swapped = true;
        }
    }

    // A pass with no swaps means the array is already sorted, so stop early.
    if (!swapped) {
        break;
    }
}

Your algorithm's best and worst cases are both O(n^2), but with the possibility of breaking out early, the best case is now O(n).
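
For example, on an already sorted array the first pass makes n - 1 comparisons without swapping anything, so swapped stays false and the loop breaks after that single pass: roughly n operations in total, hence O(n).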

fgb
  • I'll give you the best answer, because it addressed only and exactly what was bothering my mind. – Mumfi May 31 '14 at 22:15

n is the array length. You want to find T(n), the complexity of the algorithm.

Accessing memory is much more expensive than checking an if-condition, so you define T(n) to be the number of memory accesses.

In the given algorithm, both the best case and the worst case use O(n^2) memory accesses, because you check the if-condition O(n^2) times.

To make the complexity better: hold a flag, and if you don't do any swaps during a pass of the main loop, it means your array is sorted and you can break.

Now, in the best case the array is already sorted and you access all elements once, so it is O(n). In the worst case it is still O(n^2).
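
Here is a rough sketch of that idea (my own instrumentation, not eitank's code), counting each array element read or write as one memory access under this definition of T(n):

public class AccessCount {
    // Early-exit bubble sort that counts every array element read and write.
    static long sortCountingAccesses(int[] arr) {
        long accesses = 0;
        for (int i = arr.length - 1; i > 0; i--) {
            boolean swapped = false;
            for (int j = 0; j < i; j++) {
                accesses += 2; // read arr[j] and arr[j+1] for the comparison
                if (arr[j] > arr[j + 1]) {
                    int tmp = arr[j];
                    arr[j] = arr[j + 1];
                    arr[j + 1] = tmp;
                    accesses += 4; // two reads and two writes for the swap
                    swapped = true;
                }
            }
            if (!swapped) break; // no swaps: the array is sorted, stop early
        }
        return accesses;
    }

    public static void main(String[] args) {
        System.out.println(sortCountingAccesses(new int[]{1, 2, 3, 4, 5})); // best case: 8, i.e. 2(n - 1)
        System.out.println(sortCountingAccesses(new int[]{5, 4, 3, 2, 1})); // worst case: grows as O(n^2)
    }
}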

eitank
  • But in the case of a sorted array, all I do is check the if-condition? And I do it O(n^2) times! Sorry, still a bit confused. – Mumfi May 31 '14 at 21:42
  • Use a more advanced version of bubble sort to prevent unneeded memory accesses. – eitank May 31 '14 at 22:04