
I've started studying complexity and I'm struggling with this one:

void what(int n) {
    int i;
    for (i = 1; i <= n; i++) {
        int x = n;
        while (x > 0)
            x -= i;
    }
}

Well, the first for loop is clearly O(n). The first iteration of the inner loop is O(n), the second one is O(n/2)... and so on, log(n) times I guess? Which means O(n) * O(log(n)) = O(n * log(n)) complexity. Did I get this right?

Edit: (not a duplicate) I know what Big O is. I'm asking for the correct evaluation in a specific case.

chqrlie
Ilan Aizelman WS
  • IMHO this is not a duplicate of Plain English explanation of Big O at all. The OP knows what Big O is and is asking for the correct evaluation in a specific case. – Alessandro Teruzzi Feb 11 '16 at 13:53
  • Seeing as there is no return value and no side effects, can we be sure the compiler doesn't optimize it away? – jpmc26 Feb 11 '16 at 22:08
  • Woah... would you expect such a question to gain such a score? Mysteries of SO... – Eugene Sh. Feb 12 '16 at 20:54
  • Note, this may also be a trick question. Quoting Wikipedia, "_the time complexity of an algorithm quantifies the amount of time taken by an algorithm to run as **a function of the length** of the string representing the input_". The input size here is fixed and the code terminates for all inputs, so the complexity can't be greater than O(1). In fact, any self-respecting optimising compiler will turn the entire method into a no-op. – billc.cn Feb 16 '16 at 23:10
  • @billc.cn Actually, `n` is a parameter to a function here, so you can't know how it will be called. – Eugene Sh. Feb 17 '16 at 14:36

4 Answers


The outer loop runs n times.

For the i-th iteration, the inner loop runs n / i times (rounding up).

The total number of runs is:

n + n/2 + n/3 + ... + n/n

Asymptotically (ignoring integer arithmetic rounding), this simplifies to

n * (1 + 1/2 + 1/3 + ... + 1/n)

This sum grows like n * log(n): the harmonic part 1 + 1/2 + ... + 1/n is asymptotically log(n).

Hence the complexity is O(n * log(n)), as you expected.

chqrlie

On the first outer iteration, the inner loop runs n times.
On the second, it runs n/2 times.
On the third, n/3 times.
... The last one runs once.

So n + n/2 + n/3 + ... + 1 = n(1+1/2+1/3+1/4 + ... +1/n).

This is n multiplied by the partial sum of the harmonic series, which does not have a nice closed-form representation. But as shown here, it is O(log(n)). So overall the algorithm is O(n log(n)).

Eugene Sh.

As an alternative, use a variable substitution y = n - x followed by Sigma notation to analyse the number of iterations of the inner while loop of your algorithm.

[image: derivation of the inner while loop's iteration count via the substitution y = n - x]

The above overestimates, for each inner while loop, the number of iterations by 1 in the cases where n-1 is not a multiple of i, i.e. where (n-1) % i != 0. As we proceed, we'll assume that (n-1)/i is an integer for all values of i, hence overestimating the total number of iterations of the inner while loop, and accordingly using the less-than-or-equal sign at (i) below. We proceed with the Sigma notation analysis:

[image: Sigma-notation analysis summing the per-i iteration counts, yielding the bound n(1 + ln(n)) - ln(n)]

where we, at (ii), have approximated the n-th harmonic number by the associated integral. Hence, your algorithm runs in O(n·ln(n)), asymptotically.


Leaving asymptotic behaviour and studying the actual growth of the algorithm, we can use the nice sample data of exact (n, cnt) pairs (where cnt is the number of inner iterations) by @chux (refer to his answer), and compare them with the estimated number of iterations from above, i.e., n(1+ln(n))-ln(n). It's apparent that the estimate harmonizes neatly with the actual count; see the plots below or this snippet for the actual numbers.

[image: plots comparing the exact iteration counts with the estimate n(1 + ln(n)) - ln(n)]


Finally note that if we let n->∞ in the sum over 1/i above, the resulting series is the infinite harmonic series, which is, curiously enough, divergent. The classic proof of the latter groups the terms into blocks (1/2), (1/3 + 1/4), (1/5 + ... + 1/8), ..., each of which sums to at least 1/2. In practice, however, for a sufficiently large but not infinite n, ln(n) is an appropriate approximation of the sum, and this divergence is not relevant for our asymptotic analysis here.


dfrib

Attempting this by testing and graphing. The log2-vs-log2 plot looks fairly linear: something more than linear O(n), but less than O(n*log(n)). "Draw" your own conclusion.

[Edit]

Using the mathematically derived formulas, the calculated O(n * log(n)) is an upper bound. Those derivations use "fractions of loop iterations" that increase the count by a fraction rather than by 1. E.g. when n is 6, the estimated iteration count is 6 + 3 + 2 + 1.5 + 1.2 + 1 = 14.7 loops rather than the actual 6 + 3 + 2 + 2 + 2 + 1 = 16. This difference becomes relatively less significant as n increases, hence the slightly-less-than O(n * log(n)) growth. So by not ignoring the fact that the code uses integer math, we have a much more challenging question.


[image: log2-log2 plot of iteration count vs. n]

#include <limits.h>
#include <stdio.h>

unsigned long long what(int n) {
  unsigned long long cnt = 0;
  int i;
  for (i = 1; i <= n; i++) {
    int x = n;
    while (x > 0) {
      x -= i;
      cnt++;
    }
  }
  return cnt;
}

void wtest(int n) {
  unsigned long long cnt = what(n);
  printf("%d %llu\n", n, cnt);
  fflush(stdout);
}

void wtests(void) {
  int i = INT_MAX/2 + 1;
  while (i > 0) {
    wtest(i);
    i /= 2;
  }
}

int main(void) {
  wtests();
  return 0;
}

Output

1073741824 23567395117
536870912 11411566988
268435456 5519718329
134217728 2666826555
67108864 1286897093
33554432 620190504
16777216 298466265
8388608 143418602
4194304 68802063
2097152 32947406
1048576 15746897
524288 7510048
262144 3573331
131072 1695816
65536 802493
32768 378537
16384 177921
8192 83286
4096 38803
2048 17973
1024 8275
512 3782
256 1713
128 765
64 337
32 145
16 61
8 24
4 9
2 3
1 1
chux - Reinstate Monica
  • Further analysis: O(n * log(n)) is certainly a worst case - it just does not grow quite that fast. Apparently between O(n * log(n)/log(log(n))) and O(n * log(n)) – chux - Reinstate Monica Feb 11 '16 at 15:23
  • @dfri By analysis and experimentation, the O() of `what()` is `O(foo(n) * n * ln(n))` where `foo(n)` is TBD. It is not a constant, but a slowly decreasing value with larger `n`. Since it is decreasing, `O(n * ln(n))` represents an upper bound. – chux - Reinstate Monica Feb 11 '16 at 16:54
  • @dfri Your [fine mathematical analysis](http://stackoverflow.com/a/35342203/2410359), like the other 2 good answers, ignores integer arithmetic rounding. Hence the difference between `O(n * ln(n))` and the actual `O()` of `what()`. – chux - Reinstate Monica Feb 11 '16 at 17:00
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/103212/discussion-between-chux-and-dfri). – chux - Reinstate Monica Feb 11 '16 at 17:03
  • Sorry for the late return. Anyway, I tried your `n` values above against the formula I calculated below, and they [harmonize quite neatly](https://gist.github.com/dfrib/58cf872ded63a5985924). I.e., the actual growth is well captured by `n(1 + ln(n)) - ln(n)`. – dfrib Feb 11 '16 at 20:25