How to convert a simple computer algorithm into a mathematical function in order to determine the big o notation?

Question

At my university we are learning Big O notation. One question I have is: how do you convert a simple computer algorithm, for example a linear search, into a mathematical function such as 2n^2 + 1?

Here is a simple and non-robust linear search that I have written in C++11. Note: I have left out all header files (iostream) and function parameters for simplicity. I only use basic operators, loops, and data types to show the algorithm.

int array[5] = {1, 2, 3, 4, 5};
// Variable to hold the value we are searching for
int searchValue;
// Ask the user to enter a search value
cout << "Enter a search value: ";
cin >> searchValue;
// Traverse each element of the array and look for the search value
bool found = false;
for (int i = 0; i < 5; i++)
{
    if (searchValue == array[i])
    {
        found = true;
    }
}
if (found)
    cout << "Search Value Found!" << endl;
else
    // If S.V. not found then print out a message
    cout << "Sorry... Search Value not found" << endl;

In conclusion, how do you translate an algorithm into a mathematical function so that we can analyze how efficient an algorithm really is using big o notation? Thanks world.

I don't think you can do this automatically. You have to think about what is the overall relationship between N (length of the array) and the runtime (number of primitive steps in your program). — Thilo, May 18 '16 at 03:09
At least, don't disregard the virtue of indentation for the sake of readability. — , May 18 '16 at 03:12
It is operations and memory accesses that are usually measured. In the loop the value of i is compared 5 times to the size of the array and incremented 5 times and is retrieved to be used as an array index 5 times. The array is accessed 5 times and compared to the search value 5 times. So notice how every operation and memory access is being done 5 times? Well, 5 is the number of items in the array so n=5 and the mathematical function is Kn (where K is the number of things done 5 times) so the complexity is O(n) because K is a fixed constant number of operations and accesses. — Jerry Jeremiah, May 18 '16 at 05:16
Possible duplicate of [Big O, how do you calculate/approximate it?](http://stackoverflow.com/questions/3255/big-o-how-do-you-calculate-approximate-it) — Paul Hankin, May 18 '16 at 07:50

T. Claverie · Answer 1 · 2016-05-18T07:48:00.857

First, be aware that it's not always possible to analyze the time complexity of an algorithm. For some algorithms the complexity is not known, so we have to rely on experimental data.

All of the methods involve counting the number of operations done. So first, we have to define the cost of basic operations like assignment, memory allocation, and control structures (if, else, for, ...). Some values I will use (working with different models can provide different values):

Assignment takes constant time (ex: int i = 0;)
Basic operations take constant time (+ - * /)
Memory allocation takes time proportional to the memory allocated: allocating an array of n elements takes linear time.
Conditions take constant time (if, else, else if)
Loops take time proportional to the number of times their body is run.

Basic analysis

The basic analysis of a piece of code is: count the number of operations for each line. Sum those costs. Done.

int i = 1;
i = i*2;
System.out.println(i);

For this, there is one operation on line 1, one on line 2 and one on line 3. Those operations are constant: This is O(1).

for(int i = 0; i < N; i++) {
    System.out.println(i);
}

For a loop, count the number of operations inside the loop and multiply by the number of times the loop is run. There is one operation inside, which takes constant time, and it is run N times, so the complexity is N * 1, i.e. O(n).
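To connect this back to the linear search in the question, here is a minimal sketch that instruments a loop with a counter, so the "mathematical function" is simply the count as a function of the array length. The class and method names are made up for illustration, and only element comparisons are counted (an assumption about the cost model). When the value is absent the count is exactly n, which gives T(n) = n and therefore O(n).

```java
class LinearSearchSteps {
    // Counts the element comparisons made by a linear search.
    // The search stops at the first match, so the worst case is
    // a value that is not in the array at all.
    static int countComparisons(int[] array, int searchValue) {
        int comparisons = 0;
        for (int i = 0; i < array.length; i++) {
            comparisons++;                  // one comparison per visited element
            if (array[i] == searchValue) {
                break;
            }
        }
        return comparisons;
    }

    public static void main(String[] args) {
        // Arrays filled with zeros; searching for 1 never succeeds (worst case).
        for (int n : new int[] {5, 10, 20}) {
            int[] array = new int[n];
            System.out.println("n = " + n + ", comparisons = " + countComparisons(array, 1));
        }
    }
}
```

Doubling n doubles the count, which is what a linear function looks like in practice.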

for (int i = 0; i < N; i++) {
    for (int j = i; j < N; j++) {
        System.out.println(i+j);
    }
}

This one is trickier because the second loop starts its iteration at i. Line 3 does 2 operations (addition + print), each taking constant time, so the line takes constant time. How many times line 3 is run depends on the value of i. Enumerate the cases:

When i = 0, j goes from 0 to N-1 so line 3 is run N times.
When i = 1, j goes from 1 to N-1 so line 3 is run N-1 times.
...

Now, summing all this we have to evaluate N + N-1 + N-2 + ... + 2 + 1. The result of the sum is N*(N+1)/2 which is quadratic, so complexity is O(n^2).
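If the closed form feels like magic, it can be checked empirically. The following sketch (hypothetical class and method names) runs the same pair of loops, counts how many times the inner body executes, and prints the count next to N*(N+1)/2.

```java
class TriangularLoopCount {
    // Runs the same nested loops as above and counts inner-body executions.
    static long countInnerRuns(int n) {
        long runs = 0;
        for (int i = 0; i < n; i++) {
            for (int j = i; j < n; j++) {
                runs++;
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 4, 10, 100}) {
            long formula = (long) n * (n + 1) / 2;
            System.out.println("N = " + n + ", counted = " + countInnerRuns(n)
                    + ", N*(N+1)/2 = " + formula);
        }
    }
}
```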

And that's how it works for many cases: count the number of operations, sum all of them, get the result.

Amortized time

An important notion in complexity theory is amortized time. Let's take this example: running operation() n times:

for (int i = 0; i < N; i++) {
    operation();
}

If one says that operation takes amortized constant time, it means that running n operations takes linear time in total, even though one particular operation may take linear time on its own.

Imagine you have an empty array with room for 1000 elements. Now, insert 1000 elements into it. Easy as pie, every insertion took constant time. And now, insert another element. For that, you have to create a new (bigger) array, copy the data from the old array into the new one, and insert element 1001. The first 1000 insertions took constant time, the last one took linear time. In this case, we say that all insertions took amortized constant time because the cost of that last insertion is spread over the others. (This holds when the new array is bigger by a constant factor, for example twice the size; growing by one slot at a time would make every later insertion linear.)
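A rough way to see the amortization is to count how many element copies the resizing causes in total. The sketch below is only a cost simulation under an assumed doubling policy, not how any particular library implements its lists; the class and method names are hypothetical, and initialCapacity is assumed to be at least 1.

```java
class GrowableArrayCost {
    // Simulates appending n elements to an array that doubles its capacity
    // when full, and returns the total number of element copies caused by
    // resizing. Assumes initialCapacity >= 1.
    static long copiesForAppends(int n, int initialCapacity) {
        int capacity = initialCapacity;
        int size = 0;
        long copies = 0;
        for (int k = 0; k < n; k++) {
            if (size == capacity) {
                copies += size;   // every existing element moves to the new array
                capacity *= 2;    // assumed growth policy: double the capacity
            }
            size++;               // the append itself is a constant-time write
        }
        return copies;
    }

    public static void main(String[] args) {
        // The scenario from the text: capacity 1000, then a 1001st insertion.
        System.out.println(copiesForAppends(1001, 1000));
        // Starting from capacity 1, total copies stay below 2 * n.
        System.out.println(copiesForAppends(1000, 1));
    }
}
```

With doubling, the total copy work for n appends stays below 2n, so the average cost per append is bounded by a constant.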

Make assumptions

In some other cases, getting the number of operations requires making hypotheses. A perfect example of this is insertion sort, because it is simple and its running time depends on how the data is ordered.

First, we have to make some more assumptions. Sorting involves two elementary operations, that is comparing two elements and swapping two elements. Here I will consider both of them to take constant time. Here is the algorithm where we want to sort array a:

for (int i = 0; i < a.length; i++) {
    int j = i;
    while (j > 0 && a[j] < a[j-1]) {
        swap(a, j, j-1);
        j--;
    }
}

The first loop is easy. No matter what happens inside, it will run n times. So the running time of the algorithm is at least linear. Now, to evaluate the second loop we have to make assumptions about how the array is ordered. Usually, we try to define the best-case, worst-case and average-case running time.

Best-case: We never enter the while loop. Is this possible? Yes. If a is a sorted array, then a[j] >= a[j-1] no matter what j is. Thus, we never enter the second loop. So the operations done in this case are the assignment on line 2 and the evaluation of the condition on line 3. Both take constant time. Because of the first loop, those operations are run n times. So in the best case, insertion sort is linear.

Worst-case: We leave the while loop only when we reach the beginning of the array. That is, we swap every element all the way to index 0, for every element in the array. It corresponds to an array sorted in reverse order. In this case, we end up with the first element being swapped 0 times, element 2 swapped 1 time, element 3 swapped 2 times, etc. up to element n being swapped n-1 times. This is the same kind of sum as before, 0 + 1 + ... + (n-1) = n*(n-1)/2, so worst-case insertion sort is quadratic.

Average case: For the average case, we assume the items are randomly distributed inside the array. If you're interested in the maths, it involves probabilities and you can find the proof in many places. Result is quadratic.
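The best and worst cases can be observed directly by counting swaps. This is a small sketch with a hypothetical class name; it swaps adjacent elements inline instead of calling a swap helper, and it only counts swaps, not comparisons.

```java
class InsertionSortCount {
    // Sorts the array in place with insertion sort and returns the
    // number of adjacent swaps performed.
    static long sortAndCountSwaps(int[] a) {
        long swaps = 0;
        for (int i = 0; i < a.length; i++) {
            int j = i;
            while (j > 0 && a[j] < a[j - 1]) {
                int tmp = a[j];       // swap a[j] and a[j-1]
                a[j] = a[j - 1];
                a[j - 1] = tmp;
                swaps++;
                j--;
            }
        }
        return swaps;
    }

    public static void main(String[] args) {
        // Best case: already sorted, no swaps.
        System.out.println(sortAndCountSwaps(new int[] {1, 2, 3, 4, 5}));
        // Worst case: reverse order, n*(n-1)/2 swaps.
        System.out.println(sortAndCountSwaps(new int[] {5, 4, 3, 2, 1}));
    }
}
```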

Conclusion

Those were the basics of analyzing the time complexity of an algorithm. The cases were easy, but some algorithms aren't as nice. For example, you can look at the complexity of the pairing heap data structure, which is much harder to analyze.
