
There have been other questions about bubble sort time complexity, but this question is different. Everyone says that bubble sort's worst case is O(n^2). But in bubble sort, after i passes over the list, the last i elements of the list are in order and never need to be touched or compared again. The time complexity would only be O(n^2) if you needlessly ran over those final elements again and again.

Given that a major feature of bubble sort is that the elements after (input size minus iteration) never need to be compared again, because they're already in their correct places, why is the stated time complexity for something that, to me, isn't really bubble sort? Even Wikipedia says the time complexity is O(n^2), and only halfway into the article does it mention that the algorithm can be "optimised" to take only about 50% of the time by not unnecessarily comparing the last i elements.
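For concreteness, here is a minimal sketch (my own code, not from the question or the article) of the optimized variant being described, where each pass skips the already-sorted tail and the comparisons are counted:

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Optimized bubble sort: after pass i, the last i elements are in their
// final positions, so the inner loop shrinks by one element each pass.
// Returns the total number of comparisons performed.
std::size_t bubbleSort(std::vector<int>& a)
{
    std::size_t comparisons = 0;
    for (std::size_t i = 0; i + 1 < a.size(); i++)
    {
        // Stop before the already-sorted tail of i elements.
        for (std::size_t j = 0; j + 1 < a.size() - i; j++)
        {
            comparisons++;
            if (a[j] > a[j + 1])
                std::swap(a[j], a[j + 1]);
        }
    }
    return comparisons;
}
```

For n elements this performs (n-1) + (n-2) + ... + 1 = n(n-1)/2 comparisons, which is the roughly-50% figure the article mentions.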

I was reminded of this because I was making a loop which checked collisions of all my objects in the world, and the pattern was that I checked:

for (int i = 0; i < numberofobjects - 1; i++)
{
    for (int iplusone = i + 1; iplusone < numberofobjects; iplusone++)
    {
        // check collision between i and iplusone
    }
}

With 400 objects, a time complexity of O(n^2) would suggest 400 * 400 = 160,000 comparisons. However, it only did 79,800 comparisons, roughly 50%, which is exactly what Wikipedia said. This reminded me of bubble sort, so when I checked I was surprised to see everyone saying it was O(n^2).
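That count can be reproduced with a small sketch (the function name is mine) that tallies one check per unordered pair:

```cpp
// Count the iterations of the pairwise collision loop from the question:
// each unordered pair (i, iplusone) with i < iplusone is checked once,
// giving (n-1) + (n-2) + ... + 1 = n(n-1)/2 checks in total.
int countPairChecks(int numberofobjects)
{
    int checks = 0;
    for (int i = 0; i < numberofobjects - 1; i++)
        for (int iplusone = i + 1; iplusone < numberofobjects; iplusone++)
            checks++; // one collision check per pair
    return checks;
}
```

With numberofobjects = 400 this returns 400*399/2 = 79,800.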

Does this mean that whenever someone refers to bubble sort they're referring to the version that needlessly re-iterates over the final elements that have already been sorted? Also, when different algorithms are compared, bubble sort always fares the worst, but is the writer referring to the obviously bad n^2 version?

  • Gauss sez: `Sum(i = 1, n){i} = (n^2+n)/2` highest term is `n^2`. – EOF Oct 13 '17 at 10:47
  • 1
    All bubble sort variants are O(n^2). This just means that the number of comparisons grows as n^2 grows, with some scale and constant offset. So if you're sorting 100 elements, you will perform something on the order of 10,000 comparisons. It may be half that, or it may be double that, but it still grows as a quadratic. – Tom Karzes Oct 13 '17 at 10:49
  • @Tom Karzes I see, so it doesn't actually say anything about the particular instance, just how it grows? That doesn't seem as useful. Also, when you say you'll perform something on the order of 10,000 comparisons, that's not true if you use the bubble sort I'm thinking of. – Zebrafish Oct 13 '17 at 10:55
  • It sounds like you're either (a) not using bubble sort at all or (b) you don't understand the inherent complexity of the algorithm (i.e., the amount of work it's doing). You can test it though: Sort 1,000 randomly shuffled values, and time it. Then do the same for 10,000 values. The run time should increase by a factor of roughly 100. That's n^2. – Tom Karzes Oct 13 '17 at 10:59
  • Constants in big-O are thrown away. O(n^2/2) = O(n^2). big-O only applies to growth of functions and applying it to some particular case is meaningless. – Art Oct 13 '17 at 11:03
  • @Tom Karzes No, I think I don't understand the time complexity notation. When I see the worst case of bubble sort being O(n^2), I expected that the worst case if I ran a bubble sort would be input * input complexity, whereas I know it's nothing close to that, but roughly half. So I think I misunderstand what the big O means. I think it only means how it grows. Correct me if I'm wrong. – Zebrafish Oct 13 '17 at 11:03
  • 1
    Half or double doesn't matter, it still grows as n^2. If you double the input size, it takes 4 times as long to run. If you increase the input size by a factor of 10, it takes 100 times as long to run. O(n^2), O(0.5*n^2), and O(2*n^2) are all the same, and people use O(n^2) to indicate all of them. It is the measure that you want, because it lets you quickly estimate how large a data set you can handle in an acceptable amount of time. – Tom Karzes Oct 13 '17 at 11:06
  • @Tom Karzes Thanks, I'm slowly understanding now. I realise that, as another commenter said, constants are thrown away, but when you say it's the measure you want for determining an acceptable amount of time: if constants don't matter, and that constant is for example one tenth, that's a difference of ten times; surely that's something important? I realise big O doesn't say anything about that, just saying. – Zebrafish Oct 13 '17 at 11:11
  • 4
    It's not that a constant factor doesn't matter, it's just that it is dwarfed in the limit case by the growth. The simplest sort algorithms are O(n^2). The fastest sort algorithms are O(n*log(n)). Now suppose your slow algorithm is heavily optimized and takes only 0.5*n^2, and your fast algorithm is poorly optimized and takes 2*n*log(n). Now look at the complexity when n is 1000: The slow algorithm takes 0.5*1000*1000 = 500,000, while the fast algorithm takes 2*1000*10 = 20,000, so it's 25 times faster (and i'm conservatively using log2 here). And that ratio keeps growing as n grows. – Tom Karzes Oct 13 '17 at 11:20
  • @Tom Karzes Thanks, that's a good explanation. – Zebrafish Oct 13 '17 at 11:22
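The numbers in that last comparison can be checked with a quick sketch (the cost models are the illustrative assumptions from the comment, not measurements of real sorts):

```cpp
#include <cmath>

// Illustrative cost models: a heavily optimized quadratic sort doing
// 0.5*n^2 steps versus a poorly optimized fast sort doing 2*n*log2(n) steps.
double slowCost(double n) { return 0.5 * n * n; }
double fastCost(double n) { return 2.0 * n * std::log2(n); }
```

At n = 1000 the "slow" algorithm does about 25 times the work (using log2(1000) ≈ 10, as the comment does), and the ratio keeps growing with n.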

1 Answer


With 400 objects a time complexity of O(n^2) would be 400 * 400 = 160,000. However it only did 79,800 comparisons, roughly 50%

Yes, you're right about the 79,800 comparisons, but you haven't quite understood big O notation.

First of all, if you look carefully at the bubble sort algorithm, you will notice that the exact number of comparisons is:

n-1 + n-2 + ... + 1 = n(n-1)/2 exactly

This means that with n=400 you get exactly 400*399/2=79,800 comparisons.

Big O notation, though, tells you that the total number of steps is n(n-1)/2 = n^2/2 - n/2; in big O notation we ignore lower-order terms and constants and keep only n^2, so it is O(n^2).

What you need to understand here is that big O notation doesn't tell you the exact number of steps; it gives you an upper bound, i.e. the highest-order term of your complexity function, and this holds for large values of n. It simply states that for big n the order of growth is c*n^2: it describes the limiting behavior of a function as the argument tends towards a particular value or infinity.
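As a quick sanity check of the closed form (a sketch of mine, not part of the original answer):

```cpp
// Exact comparison count for the optimized bubble sort: n(n-1)/2.
long long exactComparisons(long long n)
{
    return n * (n - 1) / 2;
}
```

exactComparisons(400) gives 79,800, and doubling n to 800 roughly quadruples the count, which is the quadratic growth that O(n^2) describes.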

  • 1
    So when someone compares algorithms, and says bubble sort is bad because it's O(n^2), when as far as I see it really isn't for the bubble sort I'm thinking of, does it mean that it's actually better than a lot of people say compared to others? – Zebrafish Oct 13 '17 at 11:00
  • No, the bubble sort algorithm is ONE; it is the one you've mentioned. To compare algorithms you can't compute `exact comparisons`, since this may be difficult, and we just want to know the asymptotic complexity for big input; this is where big O is useful. An O(nlogn) algorithm (merge sort) is much better than an O(n^2) one (asymptotically) and thus is a better algorithm. This is only about asymptotic behavior (which is what we care about), i.e. for big N (N is the input size)... – coder Oct 13 '17 at 11:04
  • To understand it better, imagine N = 10^20. An O(n^2) algorithm does about 10^40 steps, and we don't care if it does 10^40/2 (or any constant division); it is still too much, and far worse than an O(nlogn) algorithm that does about 20*10^20 steps. There is a huge difference for large n. – coder Oct 13 '17 at 11:07
  • 1
    Thanks. I agree with you. But if constants aren't taken into account, I'm thinking that the constants might make a huge difference, for example if the constant is O(n^2 / 50), and Big O still considers that only n^2, that's a massive difference and could even approach of be faster than the O(n log n) case depending on the constant. – Zebrafish Oct 13 '17 at 11:17
  • Maybe these are some disadvantages of using big O; of course an algorithm with exactly n^2 steps is worse in run time than one with n^2/50. But what big O emphasizes is that for big n there is not much difference, since only the highest order counts. To understand it better: big O tells you **only asymptotic** behavior, i.e. for very large n. Does it really make a difference, n^2 or n^2/50, when n = 10^20 or 10^100? Well, none at all!! – coder Oct 13 '17 at 11:22
  • Hi, I was just looking at what you wrote; the n(n-1)/2 part works out, but n + n-1 + ... + 1 = n(n-1)/2 doesn't hold when trying it with numbers. When n = 3 the left side is 6 and the right side is 3. – Zebrafish Oct 13 '17 at 11:48