Optimal bubble sorting algorithm for an array of arrays of numbers

Question

Fix positive integers n and k.

Let A be an array of length n with A[i] an array of length k where every entry is n-i. For example, with n=5 and k=1, this is just

[ [5] , [4] , [3] , [2] , [1] ]

and for n=5 and k=2, this is

[ [5,5] , [4,4] , [3,3] , [2,2] , [1,1] ]

The goal is to bubble sort this array of arrays by swapping numbers in adjacent arrays (e.g. swap A[i][j1] with A[i+1][j2]) until every entry of A[i] is i+1 for every i.

The question is: how many swaps are necessary and what's an optimal algorithm?

NOTE: There are many, many better sorting algorithms to use. However, for this question, I am only interested in applying a bubble sort as described above. I can only interchange entries from adjacent arrays, and I am only interested in the minimum number of such interchanges necessary. I do appreciate all the suggestions for other sorting algorithms, but this is the problem that I am trying to understand.

EXAMPLES:

For k=1, this is well known. The number of swaps is the inversion number of A regarded as a permutation, and so the minimum number of swaps is the binomial coefficient (n choose 2) = n(n-1)/2 and this can be attained by swapping any out of order pair: A[i] > A[j]. For the first example, here's an optimal bubble sort:

[ [5] , [4] , [3] , [2] , [1] ]
[ [4] , [5] , [3] , [2] , [1] ]
[ [4] , [5] , [2] , [3] , [1] ]
[ [4] , [2] , [5] , [3] , [1] ]
[ [4] , [2] , [5] , [1] , [3] ]
[ [4] , [2] , [1] , [5] , [3] ]
[ [4] , [1] , [2] , [5] , [3] ]
[ [1] , [4] , [2] , [5] , [3] ]
[ [1] , [4] , [2] , [3] , [5] ]
[ [1] , [2] , [4] , [3] , [5] ]
[ [1] , [2] , [3] , [4] , [5] ]

For k=2, using the same strategy would give a bound of 2 (n choose 2) swaps needed. For the example above, that means 20 swaps. But there is a solution that uses only 15 swaps:

[ [5,5] , [4,4] , [3,3] , [2,2] , [1,1] ]
[ [5,4] , [5,4] , [3,3] , [2,2] , [1,1] ]
[ [5,4] , [3,4] , [5,3] , [2,2] , [1,1] ]
[ [5,4] , [3,4] , [2,3] , [5,2] , [1,1] ]
[ [5,4] , [3,4] , [2,3] , [1,2] , [5,1] ]
[ [5,4] , [3,4] , [2,1] , [3,2] , [5,1] ]
[ [5,4] , [3,1] , [2,4] , [3,2] , [5,1] ]
[ [1,4] , [3,5] , [2,4] , [3,2] , [5,1] ]
[ [1,4] , [3,2] , [5,4] , [3,2] , [5,1] ]
[ [1,4] , [3,2] , [2,4] , [3,5] , [5,1] ]
[ [1,4] , [3,2] , [2,4] , [3,1] , [5,5] ]
[ [1,4] , [3,2] , [2,1] , [3,4] , [5,5] ]
[ [1,4] , [1,2] , [2,3] , [3,4] , [5,5] ]
[ [1,1] , [4,2] , [2,3] , [3,4] , [5,5] ]
[ [1,1] , [2,2] , [4,3] , [3,4] , [5,5] ]
[ [1,1] , [2,2] , [3,3] , [4,4] , [5,5] ]

This solution is optimal for n=5 and k=2 (proof by brute force to find all solutions). For n=6, the best solution takes 22 swaps, but the solution doesn't look as nice as the one for n=5 (follow the 5 right, then the 1 left, then the 5 right, etc), so I still don't know an optimal strategy, much less a formula or better bound for the number of swaps.

I've been thinking about this for a couple of days now and haven't come up with anything enlightening. If anyone has any thoughts on this problem, then please share them. I'd be thrilled with knowing more about the k=2 case. Even better for any thoughts on the general case.

EDIT: I apologize if I cannot motivate this problem to your liking, but here's an attempt: the number of bubble sorts needed to sort a permutation is a very important statistic in combinatorics and number theory, called the inversion number of the permutation. You can sort an out of order permutation using much better algorithms, but this is the one that gives you the algebraic meaning. If that doesn't help, perhaps this related SO post may: What is a bubble sort good for?

UPDATE: The oldest answer below gives a lower (and upper) bound for the number of swaps. The second oldest answer gives an algorithm that comes really close to this lower bound (often attaining it). It would be fantastic if someone could improve the bound, or, even better, prove that the algorithm given below is optimal.

I don't understand. From your description, the result for `k=1` should be `[ [1], [2], [3], [4], [5] ]`, which you can get in 2 swaps, not 10. Where am I wrong? — svick, Jul 02 '11 at 23:53
@svick: My apologies. I was implicitly assuming you can only swap entries from adjacent arrays. I've now made this assumption explicit in the question. Thanks for pointing out my oversight. — PengOne, Jul 02 '11 at 23:58
are you only concerned by number of swaps (performance issues) or also number of comparisons ? — Yochai Timmer, Jul 03 '11 at 00:28
@Yochai: I don't care at all about comparisons. The only operations I am allowed to do are swaps between entries of adjacent arrays, and I want to minimize these. — PengOne, Jul 03 '11 at 00:34
So you can do any number of comparisons before you start swaping ? — Yochai Timmer, Jul 03 '11 at 01:10
@Yochai: Yes, mostly because we're always starting with the "worst case" array. — PengOne, Jul 03 '11 at 01:14

badawi · Accepted Answer · 2011-07-04T20:31:21.547

This is not an optimal answer, but i would like to share my attempt as someone may improve it. I did not thought about finding a formula to calculate the minimum number of swaps but rather on the optimal algorithm. The algorithm is based on k = 2.

The basic idea is based on information gain. Let us assume that A = {[i,j] : 1<=i<=n, 1<=j<=n} represents a configuration. In each step, we have 4 * (n-1) possible swapping to move from one configuration to another configuration. For example if n = 2 (i.e. A = [ {2,2}, {1,1} ] ), then we have 4 possible swapping A[0][0] <-> A[1][0], A[0][0] <-> A[1][1], A[0][1] <-> A[1][0], and A[0][1] <-> A[1][1]. Thus, our objective is to select the swap that has high information gain when we need to move from one configuration to another configuration.

The tricky part will be "how to calculate the information gain". In my solution (below), the information gain is based on the distance of a value from its correct position. Let me show you my code (written in C++) to understand what i am trying to say:

const int n = 5;
const int k = 2;

int gain(int item, int from, int to)
{
    if (to > from)
        return item - to;
    else
        return to - item ;
}

void swap(int &x, int &y)
{
    int temp = x;
    x = y;
    y = temp;
}

void print_config (int A[][k])
{
    cout << "[";
    for (int i=0; i<n; i++) {
        cout << " [";
        for (int j=0; j<k; j++) {
            cout << A[i][j] << ", ";
        }
        cout << "\b\b], ";
    }
    cout << "\b\b ]" << endl;
}

void compute (int A[][k], int G[][4])
{
    for (int i=0; i<n-1; i++)
    {
        G[i][0] = gain(A[i][0], i+1, i+2) + gain(A[i+1][0], i+2, i+1);
        G[i][1] = gain(A[i][0], i+1, i+2) + gain(A[i+1][1], i+2, i+1);
        G[i][2] = gain(A[i][1], i+1, i+2) + gain(A[i+1][0], i+2, i+1);
        G[i][3] = gain(A[i][1], i+1, i+2) + gain(A[i+1][1], i+2, i+1);
    }
}

int main()
{
    int A[n][k];
    int G[n-1][k*k];

    // construct initial configuration
    for (int i=0; i<n; i++)
        for (int j=0; j<k; j++)
            A[i][j] = n-i;

    print_config(A);

    int num_swaps = 0;
    int r, c;
    int max_gain;

    do {
        compute (A, G);

        // which swap has high info gain
        max_gain = -1;
        for (int i=0; i<n-1; i++)
            for (int j=0; j<k*k; j++)
                if (G[i][j] > max_gain) {
                   r = i;
                   c = j;
                   max_gain = G[i][j];
                }

        // Did we gain more information. If not terminate
        if (max_gain < 0) break;

        switch (c)
        {
            case 0: swap(A[r][0], A[r+1][0]); break;
            case 1: swap(A[r][0], A[r+1][1]); break;
            case 2: swap(A[r][1], A[r+1][0]); break;
            case 3: swap(A[r][1], A[r+1][1]); break;
        }

        print_config(A);
        num_swaps++;

    } while (1);
    cout << "Number of swaps is " << num_swaps << endl;
}

I ran the above code for cases n=1,2,... and 7. Here are the answers (number of swaps) respectively: 0, 2, 5, 10, 15, 23 (very close), and 31. I think that the function gain() does not work well when n is even. Can you confirm that by validating the number of swaps when n = 7. The lower bound of your equation is 31 so this is the optimal number of swaps when n = 7.

I am printing here the output when n = 5 (since you are looking for a pattern):

[ [5, 5],  [4, 4],  [3, 3],  [2, 2],  [1, 1] ]
[ [4, 5],  [5, 4],  [3, 3],  [2, 2],  [1, 1] ]
[ [4, 5],  [3, 4],  [5, 3],  [2, 2],  [1, 1] ]
[ [4, 5],  [3, 4],  [2, 3],  [5, 2],  [1, 1] ]
[ [4, 5],  [3, 4],  [2, 3],  [1, 2],  [5, 1] ]
[ [4, 3],  [5, 4],  [2, 3],  [1, 2],  [5, 1] ]
[ [4, 3],  [2, 4],  [5, 3],  [1, 2],  [5, 1] ]
[ [4, 3],  [2, 4],  [1, 3],  [5, 2],  [5, 1] ]
[ [4, 3],  [2, 4],  [1, 3],  [1, 2],  [5, 5] ]
[ [4, 3],  [2, 1],  [4, 3],  [1, 2],  [5, 5] ]
[ [1, 3],  [2, 4],  [4, 3],  [1, 2],  [5, 5] ]
[ [1, 3],  [2, 4],  [1, 3],  [4, 2],  [5, 5] ]
[ [1, 3],  [2, 1],  [4, 3],  [4, 2],  [5, 5] ]
[ [1, 1],  [2, 3],  [4, 3],  [4, 2],  [5, 5] ]
[ [1, 1],  [2, 3],  [2, 3],  [4, 4],  [5, 5] ]
[ [1, 1],  [2, 2],  [3, 3],  [4, 4],  [5, 5] ]

This is interesting, thanks! My lower bound for n=9 gives 51, so if you can actually get 53 that's pretty close and likely optimal. I'll have to think on this a bit, but thanks so much for this! — PengOne, Jul 04 '11 at 17:25
In case of n = 7, the algorithm gives your lower bound 31, which is the optimal. — badawi, Jul 04 '11 at 18:36

score 4 · Answer 2 · answered Jul 03 '11 at 20:47

I know it's rather tacky to answer one's own question, but I've just figured this out and it is closer to an answer than it is to part of the question. However, this is not a complete answer and will not get accepted, so please post thoughts if anyone can improve this.

The minimum number of swaps, say m, for k=2 is bounded by:

2 * (n choose 2) >= m >= (2n choose 2) / 3

Why does this work?

The upper bound comes doing a bubble sort on the first elements of the arrays, followed by a bubble sort on the second elements of the arrays. That part isn't so tricky.

The lower bound is a bit tricky, but here's how I came to it. Let's count the number of passes, where a pass happens when a larger number moves from the left of a smaller number to the right of that number. This can happen in 1 swap of a and b, with a larger and in the array to the left of b. It can also take 2 swaps if a is moved to the array with b in one swap and then moves on in a later swap. To keep track of things correctly, count passes in halves in this case. To make counting easier, it also counts as a pass when two of the same number split up and then recombine.

The array is fully sorted after (2n choose 2) passes, so the only question is how many passes can happen with one swap. Here's a simple example where a and c are swapped:

... [a,b] , [c,d] ... 
... [c,b] , [a,d] ...

Now let's count the maximum number of passes that can have happened:

Since a > c, we definitely get 1 full pass.
If a > b, then we get 1/2 pass because a must have been left of b at some point.
If a > d, then we get 1/2 pass because a will be right of d at some point.
If c < d, then we get 1/2 pass because d must have been left of c at some point.
If c < b, then we get 1/2 pass because b will be right of c at some point.

Therefore the best you can do on a swap is to get 3 passes (1 full and 4 halves).

Why is this not a complete answer?

I have no idea if the lower bound is always attainable! I don't think it is, and, despite several failed attempts, I can't code up an algorithm that achieves it.

For 5 you get 15 and for 6 you get 22, and these are the numbers you say are best. What do you get for 7? — Daniel, Jul 03 '11 at 21:22

Ricky Bobby · Answer 3 · 2011-07-12T16:23:46.517

Here is an intuitive algorithm I thought of. It gives a constructive proof of the optimal solution I think.

Here is the algorithm :

I tried it for n= 4 5 6 7 9 and it gave the same results as the one from badawi:

The idea is the following:

1: chose one extreme value that is not at his final place ( 1 or n to start)

2: find the extreme value which is the closest to his final position ( marked with an arrow in my example below)

3: If it's among the largest elment,

then move it to the other side and shifht all smallest element of the pair to the left

otherwise

move it to the otherside and shift all the largest element of each pair to the right .

Note: shifting is equivqlent to "bubbling" this value with the smalles (resp largest) element of each pair.

4: go back to step 2, but if you chose one of the large take one of the small and vice versa.

It's pretty intuitive and it seems to work:

Example n=5:

11 22 33 44 55 
^
|
12 23 34 45 51 (4 moves) // shifted all larger numbers to the left
          ^
          |
52 13 24 43 51 (3 moves) // shifted all smaller numbers to the right
   ^
   |
52 34 24 35 11 (3 moves) // shifted all larger numbers to the left
          ^
          |
55 24 34 32 11 (3 moves) // smaller to the right
   ^
   |
55 44  33 22 11 (2 moves) // larger to left

Total 15 moves.

second example n=7:

11 22 33 44 55 66 77 // 6 moves
 ^
12 23 34 45 56 67 71 //5 moves
                ^
72 13 24 35 46 56 71 //5 moves
   ^
72 34 25 36 46 57 11 // 4 moves
                ^
77 24 35 26 36 45 11 //4 moves
   ^
77 45 36 26 35 42 11 //1 move
       ^       
77 65 34 26 35 42 11 //2 moves
         ^
77 65 34 56 34 22 11 //2 moves
          ^
77 66 54 53 34 22 11 //1 move
          ^
77 66 54 45 33 22 11 //1 move
          ^
77 66 55 44 33 22 11

total: 31

Don't hesitate to ask me questions if i'm not clear.

It's pretty easy to do it by hand. You can try it yourself with 6 or 7 or write an algorithm.

I tried it with 6 it gave 23. , with 7 it gave 31 and with 9 it gave 53 , it takes one minute to calculate it by hand without computing anything

Why this solution is optimal :

Each time you move one large element to the opposite side, you move all the smallest one of the pair to the left.

So moving all the large element will not make you lose any move for the moving all smallest one.

You always move you element in "the right direction"

Moreover you for moving the extreme elements you make the minimum number of moves. (this is because the algorithm takes the extreme value closest to his last position that no move is lost)

The reasonning is the same for small element.

This algorithm gives you optimal moves since it doesn't make any unnecessary move.

Hope I didn't make any mistake .

It proves that badawi results were optimal as you expected.

So ?? not a single reaction ? at least to say if it sounds like something true ? — Ricky Bobby, Jul 13 '11 at 15:38
As far as I can tell, this is badawi's algorithm. Your "proof" that this is optimal is not a proof so much as a heuristic. Of the types of moves you're doing, these are the best, but it's unclear that there are not better moves to do. From an information theoretic pov, you want to maximize the number of "passes" as in my bound argument, but your method doesn't guarantee a global optimization so much as a local one. So each move is an optimal move, but it may be that doing suboptimal moves at strategic places gives a better result. — PengOne, Jul 14 '11 at 15:52
@PengOne ok thx I get why it doesn't prove optimality. Thinking about it the only way to prove this solution is the best would be to reach the lower bound (not the case) like in the demonstration in the case k=1. — Ricky Bobby, Jul 17 '11 at 16:20

Optimal bubble sorting algorithm for an array of arrays of numbers

3 Answers3

Here is the algorithm :

Why this solution is optimal :

Linked