2

Suppose you have an array of integers (e.g. [1 5 3 4 6]). The elements are rearranged according to the following rule: every element can hop forward (towards the left), and the elements in the indices it hops over each slide one position back. The process starts with the element at the second index (i.e. 5). It can either hop over element 1 or stay in its own position. If it does hop, element 1 slides down to index 2, and the resulting array is [5 1 3 4 6]. Element 3 can now hop over one or two positions, and the process repeats: hopping one position gives [5 3 1 4 6], hopping two gives [3 5 1 4 6].
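The hop operation itself can be sketched in Python (`hop` is just an illustrative name for this post, not part of any library):

```python
def hop(arr, i, k):
    """Element at index i hops k positions towards the left;
    the elements it passes over each slide one index back (right)."""
    assert 0 < i < len(arr) and 0 < k <= i, "can only hop left, within bounds"
    x = arr.pop(i)        # take the hopper out
    arr.insert(i - k, x)  # re-insert it k places to the left
    return arr

# The example from above: 5 hops over 1, then 3 hops over two positions.
hop([1, 5, 3, 4, 6], 1, 1)  # -> [5, 1, 3, 4, 6]
hop([5, 1, 3, 4, 6], 2, 2)  # -> [3, 5, 1, 4, 6]
```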

It is easy to show that every permutation of the elements can be reached this way. Moreover, any final configuration is reached by a unique sequence of hops.

The question is: given a source array and a final array, find the total number of hops required to arrive at the final array from the source. An O(N^2) implementation is easy to find, but I believe this can be done in O(N) or O(NlogN). If it is not possible to do better than O(N^2), it would be great to know that too.

For example if the final array is [3,5,1,4,6] and the source array [1,5,3,4,6], the answer will be 3.

My O(N^2) algorithm is like this: loop over the positions of the source array from the end, since we know that the last element is the last one to move. Here that is 6; we find its position in the final array, compute the number of hops it made, and then rearrange the final array to put that element back in its original position in the source array. The rearranging step goes over all the elements in the array, and the outer loop runs over all the elements, hence O(N^2). A hashmap or map can help with the searching, but the map needs to be updated after every step, which keeps the whole thing O(N^2).
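A sketch of that O(N^2) procedure in Python (`hop_count_quadratic` is my own name for it): working from the right, find each source element in a working copy of the final array, charge it the hops it made, and slide it back to its source position.

```python
def hop_count_quadratic(a, f):
    """Count hops needed to turn source a into final f.
    O(N^2): each iteration does an O(N) search and an O(N) slide."""
    w = list(f)                      # working copy of the final array
    hops = 0
    for p in range(len(a) - 1, -1, -1):
        j = w.index(a[p])            # where did a[p] end up? (O(N) search)
        hops += p - j                # it hopped p - j places to the left
        # undo the hops: slide w[j] rightwards back to position p
        w[j:p + 1] = w[j + 1:p + 1] + [a[p]]
    return hops

hop_count_quadratic([1, 5, 3, 4, 6], [3, 5, 1, 4, 6])  # -> 3
```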

P.S. I am trying to model the correlation between two permutations in a Bayesian way, and this is a sub-problem of that. Any ideas on modifying the generative process to make the problem simpler are also welcome.

Edit: I have found my answer. This is exactly the Kendall tau distance, and there is an easy merge-sort-based algorithm that computes it in O(NlogN).
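For reference, a sketch of that merge-sort-based computation in Python (`hop_distance` is an illustrative name): relabel the source elements by their indices in the final array, then count inversions while merge sorting.

```python
def hop_distance(source, final):
    """Kendall tau distance between two permutations of the same
    elements: the number of inversions of source relabeled through
    final's index map. O(N log N) via merge sort."""
    pos = {v: i for i, v in enumerate(final)}
    seq = [pos[v] for v in source]

    def sort_count(xs):
        if len(xs) <= 1:
            return xs, 0
        mid = len(xs) // 2
        left, a = sort_count(xs[:mid])
        right, b = sort_count(xs[mid:])
        merged, inv = [], a + b
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i]); i += 1
            else:
                inv += len(left) - i   # every remaining left element inverts with right[j]
                merged.append(right[j]); j += 1
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged, inv

    return sort_count(seq)[1]

hop_distance([1, 5, 3, 4, 6], [3, 5, 1, 4, 6])  # -> 3
```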

  • what is your O(N^2) algorithm? – v78 Nov 11 '16 at 08:57
  • Why "the answer will be 3"? There are 2 hops here - 3 hops to 1st place and 5 hops to second. Also, can we assume integers in array are unique or not? – Alexander Anikin Nov 11 '16 at 09:19
  • @AlexanderAnikin You can only hop towards your left not to your right. 3 is in the leftmost position, so it does not have any valid hop. 5 hops to exchange positions with 3 which gives [5,3,1,4,6] with the current number of hops being 1. Then element 1 hops two places towards left to make the current array [1,5,3,4,6]. 4 and 6 are already in their desired position. So the total number of hops come out to be 3. – Rohan Mukherjee Nov 11 '16 at 15:34
  • @ee2 The O(N2) algorithm is like this. You loop over all the positions of the source array from the end, since we know that is the last element to move. Here it will be 6 and we check its position in the final array. We calculate the number of hops necessary and need to rearrange the final array to put that element in its original position in the source array. The rearranging step goes over all the elements in the array and the process loops over all the elements, hence O(N^2). Using Hashmap or map can help in searching, but the map needs to be updated after every step which makes in O(N^2) – Rohan Mukherjee Nov 11 '16 at 15:42
  • @PatriceGahide For now I am out of luck in getting anything close to O(N) or O(NlogN). – Rohan Mukherjee Nov 11 '16 at 15:46

2 Answers

1

Consider the target array as defining an ordering. A target array [2 5 4 1 3] can be seen as [1 2 3 4 5], just by relabeling. You only need to know the mapping to be able to compare elements in constant time. In this instance, to compare 4 and 5 you check: index[4]=2 > index[5]=1 (in the target array), so 4 > 5 (meaning: 4 must be to the right of 5 at the end).

So what you really have is just a vanilla sorting problem. The ordering is just different from the usual numerical ordering. The only thing that changes is your comparison function. The rest is basically the same. Sorting can be achieved in O(nlgn), or even O(n) (radix sort). That said, you have some additional constraints: you must sort in-place, and you can only swap two adjacent elements.
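The relabeling and constant-time comparison above can be sketched in Python (`comes_after` is a name invented for this sketch):

```python
# The answer's example target array, 0-based indices as in the post.
target = [2, 5, 4, 1, 3]
index = {v: i for i, v in enumerate(target)}   # the relabeling map

def comes_after(x, y):
    """True if x > y under the target ordering, i.e. x must end up
    to the right of y. O(1) per comparison via the index map."""
    return index[x] > index[y]

# Sorting any permutation of these values with this ordering
# reproduces the target array.
sorted([1, 2, 3, 4, 5], key=index.get)  # -> [2, 5, 4, 1, 3]
```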

A strong and simple candidate is selection sort, which does just that in O(n^2) time: on each iteration, identify the "leftmost" remaining element (under the target ordering) in the "unplaced" portion and swap it leftwards until it lands at the end of the "placed" portion. This improves to O(nlgn) with an appropriate data structure (a priority queue for identifying the "leftmost" remaining element in O(lgn) time). Since nlgn is a lower bound for comparison-based sorting, I really don't think you can do better than that.

Edit: So you're not interested in the sequence of swaps at all, only the minimum number of swaps required. This is exactly the number of inversions in the array (adapted to your particular needs: the "non-natural ordering" comparison function, but that doesn't change the maths). See this answer for a proof of that assertion.

One way to find the number of inversions is to adapt the merge sort algorithm. Since you actually sort the array while counting, it is still O(nlgn) time. For an implementation, see this answer or this (again, remember that you'll have to adapt the comparison).

Patrice Gahide
  • Thanks for the answer. I was thinking along similar lines, so I might be mixing your idea with mine and missing the correct solution. I might be wrong here, but I believe with a sort like this we are missing the original answer, i.e. the total number of adjacent hops to get from the source array to the target. – Rohan Mukherjee Nov 11 '16 at 21:24
  • @RohanMukherjee Oh, you really don't need the sequence of swaps, only the number of swaps. I kind of missed that ;) I'll think about it tomorrow. Maybe Alexander's answer (which I don't have time to go through now) will do the trick. – Patrice Gahide Nov 11 '16 at 23:28
  • Great! That's exactly what I needed. Pretty neat solution to a very interesting problem. – Rohan Mukherjee Nov 13 '16 at 01:28
1

From your answer I assume the number of hops is the total number of swaps of adjacent elements needed to transform the original array into the final array.

I suggest using something like insertion sort, but without the insertion part – the data in the arrays will not be altered.

You can keep a queue t of stalled hoppers as a balanced binary search tree with counters (the number of elements in each subtree).

You can add an element to t, remove an element from t, rebalance t, and find an element's position in t in O(log C) time, where C is the number of elements in t.

A few words on finding the position of an element: it is a binary search that accumulates the counts of the skipped left subtrees (plus 1 for each middle element, if you keep elements in internal nodes).

A few words on balancing/insertion/removal: you traverse upward from the removed/added element/subtree and update the counters along the way. The overall number of operations still holds at O(log C) for insert+rebalance and remove+rebalance.

Let t be that (balanced search tree) queue, p the current index into the original array, q the index into the final array, a the original array, and f the final array.

Now we have one loop starting from the left (say, p=0, q=0):

  • If a[p] == f[q], then the original array's element hops over the whole queue. Add t.count to the answer, then increment p and q.

  • If a[p] != f[q] and f[q] is not in t, then insert a[p] into t and increment p.

  • If a[p] != f[q] and f[q] is in t, then add f[q]'s position in the queue to the answer, remove f[q] from t, and increment q.

I like the magic that ensures this process moves p and q to the ends of their arrays at the same time if the arrays really are permutations of each other. Nevertheless, you should probably check that p and q stay in bounds to detect incorrect data, as there is no faster way to prove the data is correct.
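The three rules above can be sketched in Python. As an assumption for this sketch, the queue is kept in insertion (source) order and implemented as a plain list, so each queue operation is O(C) rather than the O(log C) a counter-augmented balanced BST would give; `count_hops` is a name invented here.

```python
def count_hops(a, f):
    """Sketch of the queue-of-stalled-hoppers algorithm.
    t is kept in insertion (source) order; a balanced BST with
    subtree counters would make each queue operation O(log C)."""
    t = []                      # stalled hoppers, in insertion order
    ans = 0
    p = q = 0
    n = len(a)
    while q < n:
        if p < n and a[p] == f[q]:
            ans += len(t)       # a[p] hops over the whole queue
            p += 1; q += 1
        elif p < n and f[q] not in t:
            t.append(a[p])      # a[p] stalls and joins the queue
            p += 1
        else:
            i = t.index(f[q])   # f[q]'s position in the queue
            ans += i
            t.pop(i)
            q += 1
    return ans

count_hops([1, 5, 3, 4, 6], [3, 5, 1, 4, 6])  # -> 3
```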

Alexander Anikin
  • Thanks for your comment. I am trying to understand the process. Let us assume that the source array is [5,4,3,1,2] and the final array is [2,4,5,3,1]. This means no index satisfies the a[p] == f[q] condition until p reaches the end. What happens to t after that? If it follows the a[p] != f[q] condition, it probably won't end up with the correct result every time. Is there anything wrong in my understanding? – Rohan Mukherjee Nov 11 '16 at 21:09
  • Well, you are right. It would work for a sorted source array (so that the ordering of the elements is compatible with the search tree). To make it work for an arbitrary array of numbers, there would need to be an additional "linearization" of the final array. – Alexander Anikin Nov 14 '16 at 16:35