tl;dr: I describe a couple of examples first, then analyze the complexity of OP's stated problem at the bottom of this post.
In short, big O notation tells you how a program's running time grows as you scale up its input.
Imagine a program (P0) that counts to 100. No matter how often you run it, it takes (roughly) the same amount of time to count to 100. Obvious, right?
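To make that concrete, a minimal sketch of what P0 could look like (this code is mine, not OP's):

```
// P0: always counts to the fixed number 100, no matter what.
// Every run does the same amount of work: constant time, O(1).
for (int i = 1; i <= 100; i++) {
    Console.WriteLine(i);
}
```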
Now imagine a program (P1) that counts to a variable number, i.e. it takes a number as input and counts up to it. We call this variable n. Now each time P1 runs, its performance depends on the size of n. If we make n 100, P1 will run very quickly. If we make n equal to a googolplex, it's going to take a little longer. Basically, the performance of P1 depends on how big n is, and this is what we mean when we say that P1 has time-complexity O(n).
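A minimal sketch of what P1 could look like (the method name `CountTo` is just an illustration):

```
// P1: counts up to a caller-supplied n. Twice the n means twice the work,
// so the running time grows linearly with the input: O(n).
void CountTo(long n) {
    for (long i = 1; i <= n; i++) {
        Console.WriteLine(i);
    }
}
```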
Now imagine a program (P2) that counts to the square of n (i.e. n * n) rather than to n itself. Clearly the performance of P2 is going to be worse than P1, because the number to which they count differs immensely, especially for larger n's (= scaling). You'll know by intuition that P2's time-complexity is O(n^2) if P1's is O(n).
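Again a minimal sketch, with a made-up method name:

```
// P2: counts up to n squared. Doubling n quadruples the work: O(n^2).
// (Keep n modest here; n * n overflows for very large inputs.)
void CountToSquareOf(long n) {
    for (long i = 1; i <= n * n; i++) {
        Console.WriteLine(i);
    }
}
```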
Now consider a program (P3) that looks like this:
```
var length = input.Length;
for (var i = 0; i < length; i++) {
    for (var j = 0; j < length; j++) {
        Console.WriteLine($"Product is {input[i] * input[j]}");
    }
}
```
There's no n to be found here, but as you might realise, this program still depends on an input, called `input` here. Simply because the program depends on some kind of input, we refer to that input as n when we talk about time-complexity. If a program takes multiple inputs, we simply give them different names, so that a time-complexity could be expressed as, say, O(n * m + p * q), where this hypothetical program would take 4 inputs.
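For example, here is a hypothetical two-input program (the names are made up) whose time-complexity we would write as O(n * m), with n and m the lengths of the two arrays:

```
// Every element of `first` is paired with every element of `second`,
// so the body runs first.Length * second.Length times: O(n * m).
void PrintAllPairs(int[] first, int[] second) {
    for (int i = 0; i < first.Length; i++) {
        for (int j = 0; j < second.Length; j++) {
            Console.WriteLine($"{first[i]} * {second[j]} = {first[i] * second[j]}");
        }
    }
}
```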
For P3, we can discover its time-complexity by first identifying the inputs, and then analyzing how its performance depends on them.
P3 uses three variables, called `length`, `i` and `j`. The first line of code does a simple assignment, whose performance does not depend on any input, meaning the time-complexity of that line is O(1), i.e. constant time.
The second line of code is a `for` loop, implying we're going to do something that might depend on the length of something. And indeed, this first `for` loop (and everything in it) will be executed `length` times. If we increase `length`, this line of code does linearly more work, so its time-complexity is O(length) (called linear time).
The next line of code takes O(length) time again, following the same logic as before. However, since it is executed once for every iteration of the `for` loop around it, the two multiply: O(length) * O(length) = O(length^2).
The inside of the second `for` loop does not depend on the size of the input (even though the input is necessary), because indexing into the input (for arrays!) does not get slower as the input grows. This means the inside runs in constant time, O(1). Since it runs inside the other `for` loops, we again multiply to obtain the total time-complexity of the nested lines of code: surrounding for-loops * current block of code = O(length^2) * O(1) = O(length^2).
The total time-complexity of the program is just the sum of everything we've calculated: O(1) + O(length^2) = O(length^2) = O(n^2). The first line of code was O(1) and the `for` loops were analyzed to be O(length^2). You will notice two things:
- We rename `length` to n: we do this because we express time-complexity in terms of generic parameters, not the names that happen to live inside the program.
- We drop the O(1) from the equation: we're only interested in the biggest (= fastest growing) terms. Since O(n^2) grows far faster than O(1), the time-complexity is defined to be equal to it. (This only works for terms, i.e. parts split by +, not for factors, i.e. parts split by *.) A small numeric illustration follows after this list.
**OP's problem**
Now we can consider your program (P4), which is a little trickier, because the variables within the program are defined a little less cleanly than the ones in my examples.
```
for (int i = n; i > 1; i /= 3) {
    for (int j = 0; j < n; j += 2) {
        ...
    }
    for (int k = 2; k < n; k = k * k) {
        ...
    }
}
```
If we analyze it, we can say the following:

- The first loop is executed O(log3(n)) times, where log3 is the logarithm with base 3. Since `i` is divided by 3 on every pass, it takes about log3(n) divisions before `i` is smaller than or equal to 1. (Repeatedly dividing by a constant gives a logarithm, and since log3(n) and log2(n) only differ by a constant factor, we can simply write O(log(n)).)
- The second `for` loop is linear in time: its body is executed O(n / 2) times, because `j` is increased by 2 rather than 1, which would be 'normal'. Since we know that O(n / 2) = O(n), we can say that this `for` loop is executed O(log(n)) * O(n) = O(n * log(n)) times (first `for` * the nested `for`).
- The third `for` is also nested in the first `for`, but since it is not nested in the second `for`, we're not going to multiply it by the second one (obviously, because it is only executed each time the first `for` is executed). Here, `k` is bounded by n, but since it is increased by a factor of itself each pass (we square it), its growth is not linear, i.e. its increase is defined by the variable itself rather than by a constant. Squaring each pass means `k` takes the values 2, 4, 16, 256, ..., i.e. 2^(2^t) after t passes, so it reaches n in about log2(log2(n)) steps. Deducing this is easy if you understand how log works; if you don't get this, you need to understand that first. In any case, since this loop runs O(log(log(n))) times, the total complexity of the third `for` is O(log(n)) * O(log(log(n))) = O(log(n) * log(log(n))).
- The total time-complexity of the program is now calculated as the sum of the different sub-time-complexities: O(n * log(n)) + O(log(n) * log(log(n))).
As we saw before, we only care about the fastest growing term in big O notation, so we say that the time-complexity of your program is equal to O(n * log(n)). If you want to sanity-check that, see the sketch below.
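Here is a quick empirical sketch (assuming the elided loop bodies do constant work) that counts how often P4's inner loop bodies run:

```
// Counts the loop-body executions of P4 for growing n. Constants aside,
// the count should grow like n * log(n), matching the analysis above.
foreach (int n in new[] { 1_000, 100_000, 10_000_000 }) {
    long count = 0;
    for (long i = n; i > 1; i /= 3) {                 // ~log3(n) passes
        for (long j = 0; j < n; j += 2) count++;      // ~n/2 per pass
        for (long k = 2; k < n; k = k * k) count++;   // ~log2(log2(n)) per pass
    }
    Console.WriteLine($"n = {n}: {count} vs n * log3(n) ≈ {n * Math.Log(n, 3):F0}");
}
```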