5

In Java, what is faster: to create, fill in and then sort an array of ints like below

int[] a = int[1000];
for (int i = 0; i < a.length; i++) {
    // not sure about the syntax
    a[i] = Maths.rand(1, 500) // generate some random positive number less than 500
}
a.sort(); // (which algorithm is best?)

or insert-sort on the fly

int[] a = int[1000];
for (int i = 0; i < a.length; i++) {
    // not sure about the syntax
    int v = Maths.rand(1, 500) // generate some random positive number less than 500
    int p = findPosition(a, v); // where to insert
    if (a[p] == 0) a[p] = v;
    else {
        shift a by 1 to the right
        a[p] = v;
    }
}
templatetypedef
  • 362,284
  • 104
  • 897
  • 1,065
Sophie Sperner
  • 4,428
  • 8
  • 35
  • 55

1 Answers1

10

There are many ways that you could do this:

  1. Build the array and sort as you go. This is likely to be very slow, since the time required to move array elements over to make space for the new element will almost certainly dominate the sorting time. You should expect this to take at best Ω(n2) time, where n is the number of elements that you want to put into the array, regardless of the algorithm used. Doing insertion sort on-the-fly will take expected O(n2) time here.

  2. Build the array unsorted, then sort it. This is probably going to be extremely fast if you use a good sorting algorithm, such as quicksort or radix sort. You should expect this to take O(n log n) time (for quicksort) or O(n lg U) time (for radix sort), where n is the number of values and U is the largest value.

  3. Add the numbers incrementally to a priority queue, then dequeue all elements from the priority queue. Depending on how you implement the priority queue, this could be very fast. For example, using a binary heap here would cause this process to take O(n log n) time, and using a van Emde Boas tree would take O(n lg lg U) time, where U is the largest number that you are storing. That said, the constant factors here would likely make this approach slower than just sorting the values.

Hope this helps!

templatetypedef
  • 362,284
  • 104
  • 897
  • 1,065
  • Thank you for this interesting answer. The third option is nice for studying, but looks like possible overhead :) – Sophie Sperner Aug 21 '12 at 22:23
  • Just curious, is the `log` in O(n log n) different from the `lg` in O(n lg U)? – Brian J Aug 22 '12 at 03:27
  • @BrianJ- No, it's not. Logarithms in big-O notation don't need to have the base specified, since they're all just constant multiples of one another. I used lg (binary logarithm) in the second case to make it clearer that the lg is derived from the number of bits in U, but for clarity I probably should have been more consistent. – templatetypedef Aug 22 '12 at 03:34
  • @templatetypedef Is that really true? It's like saying the exponents in O(n^e) don't need to be specified. I'd prefer an O(log10(n)) algorithm over an O(log2(n)) algorithm. Wouldn't you? – MikeB Apr 25 '16 at 03:43
  • @MikeB There's a difference between saying that the bases of logs don't matter in big-O notation and saying that the bases of exponents don't matter or that the exponents themselves don't. In the second two cases, if absolutely is significant. The reason that it works for logarithms is that log_b a = log_c a / log_c b for any bases b and c, so all logarithms are just a constant factor off of one another and big-O notation ignores constants. For that reason, I wouldn't know whether a O(log_10 n) or an O(log_2 n) algorithm would be better for the same reason that... – templatetypedef Apr 25 '16 at 04:27
  • @MikeB ... I wouldn't know whether an O(n) or an O(10n) algorithm is better: any algorithm whose runtime is O(n) is also O(10n) and vice-versa, so the constant doesn't actually say anything. – templatetypedef Apr 25 '16 at 04:29