12

Problem statement:

Find the minimum number of swaps to convert one string into another of the same length, which may or may not have duplicate characters; arbitrary swaps are allowed.

min_swaps('kamal', 'amalk') -> 3
#       1         2        3
# kamal -> lamak -> aamlk -> amalk

Note: There are many of these questions on SO, but none that seem to apply to arbitrary swaps.

Initial Approach:

let s1 = 'kamal'
let s2 = 'amalk'

Assume that s1 is the "correct" ordering, that is its elements map to sequence from 0 -> N-1 in increasing order.

0 1 2 3 4
k a m a l

Now create an array P that is a mapping from letters in s2 to the correct index in s1:

1 2 3 4 0
a m a l k 

P = [1,2,3,4,0]

Now we can count the number of array inversions in P using a modified mergesort, which will give us the number of elements that are out of order.

Modified mergesort:

int main(int argc, char ** argv) {
   int array[] = { 1,2,3,4,0 };

   int array_size = sizeof(array)/sizeof(array[0]);
   int inversions = merge_sort(array, 0, array_size - 1);

   printf("Found %d inversions\n", inversions);
   return 0;
}

int merge_sort(int a[], int start, int end) {
   if ( end > start ) {
      int mid = start + (end - start) / 2;
      int x = merge_sort(a, start, mid);
      int y = merge_sort(a, mid + 1, end);
      int z = merge(a, start, mid, end);

      return x + y + z;
   }

   return 0;
}

int merge(int a[], int start, int mid, int end) {
   int l = start, r = mid + 1;
   int i = 0;
   int temp[end - start + 1];
   int splitInversionCount = 0;

   while ( l <= mid && r <= end ) {
       if ( a[l] < a[r] ) {
          temp[i++] = a[l++];
       }
       else {
         splitInversionCount += mid - l + 1;
         temp[i++] = a[r++];
       }
   }

   // copy whichever half of the array remains
   while ( l <= mid ) {
      temp[i++] = a[l++];
   }

   while ( r <= end ) {
      temp[i++] = a[r++];
   }

   // copy temp back into a
   int k;
   for(k = 0; k < i; k++) {
      a[k + start] = temp[k];
   }

   return splitInversionCount;
}

The problem with this is that it gives the minimum number of swaps with only adjacent elements, this returns 4 instead of 3.

Question:

Is there any way to extend this algorithm to capture arbitrary swaps as well? Or do I need a completely different approach?

Hunter McMillen
  • 59,865
  • 24
  • 119
  • 170
  • Do you have any time constrains? Input size? – jspurim May 15 '14 at 14:30
  • 2
    This question looks as though it's about arbitrary swaps: http://stackoverflow.com/questions/18292202/finding-the-minimum-number-of-swaps-to-convert-one-string-to-another-where-the?rq=1 – David Eisenstat May 15 '14 at 14:31
  • 1
    You are looking for https://en.wikipedia.org/wiki/Cayley%27s_theorem and https://en.wikipedia.org/wiki/Cayley_graph . The problem is roughly equivalent to the Rubik's Cube solution algorithm. – Sergey L. May 15 '14 at 14:32
  • @jspurim No, this is the example case I am working with. The final solution should be able to handle strings of sufficiently large N. The above algorithm for adjacent swaps is `O(n log n)` for reference. – Hunter McMillen May 15 '14 at 14:32
  • You need another approach as you approach not guaranteed a minimal count of swaps. – Толя May 15 '14 at 14:50
  • Marking this as a duplicate of the question referenced by David – Niklas B. May 15 '14 at 14:51
  • @NiklasB. Fair enough, I missed that one in my search; but as you say it doesn't have a good answer. – Hunter McMillen May 15 '14 at 14:52
  • 1
    @HunterMcMillen I just edited my comment because I realized the second answer is actually a pretty good analysis. Also, in any case, if somebody has a better solution, it should be added there – Niklas B. May 15 '14 at 14:52
  • @NiklasB. Which? the one by bsd? – Hunter McMillen May 15 '14 at 14:56
  • 1
    No the one by Subhasis Das. Sorry I forgot that the ordering is non-deterministic :) – Niklas B. May 15 '14 at 14:57
  • Subhasis's answer was subtly wrong. I think that I [fixed it](http://stackoverflow.com/a/23687695/2144669), so I'm joining the close duplicate crowd. – David Eisenstat May 15 '14 at 20:12

1 Answers1

2

I have not a full solution yet, but I think it's useful to state this to help others.

Let's assume the strings have size N. Let's call k the number of characters that are already in their position. Initially k <= N

A swap can have either:

  1. Set 2 characters in the right position. Increase k by 2
  2. Set 1 character in the right position. Increase k by 1
  3. Do nothing useful, k remains the same.

You can always make a move of the second kind, and you can find it in O(n) time. Just take one character out of pos, and find the position where it needs to be, then swap.

You shouldn't make a swap of the first type as soon as you see one, as you could break too other moves that set 2 characters at the same time. If you do so you'll end up setting 4 chars in 3 moves when you could have done it in 2.

Making a move of the third kind may allow making 2 moves of the first kind, so all three kind of moves are useful.

jspurim
  • 925
  • 8
  • 25
  • How to prove your greedy? What is guaranteed that circles with size more than 2 (N members) will not be resolve in N swaps instead of N-1? – Толя May 15 '14 at 15:00
  • @Толя I'm not proposing a greedy algorithm, I'm just stating some thoughts about the problem to try to reach a solution. Actually I'm starting to doubt a greedy exists. I'm reading the link Sergey L. posted, but my math is not that good. – jspurim May 15 '14 at 15:04