Problem statement:
Find the minimum number of swaps to convert one string into another of the same length, which may or may not have duplicate characters; arbitrary swaps are allowed.
min_swaps('kamal', 'amalk') -> 3
# 1 2 3
# kamal -> lamak -> aamlk -> amalk
Note: There are many of these questions on SO, but none that seem to apply to arbitrary swaps.
Initial Approach:
let s1 = 'kamal'
let s2 = 'amalk'
Assume that s1 is the "correct" ordering, that is its elements map to sequence from 0 -> N-1
in increasing order.
0 1 2 3 4
k a m a l
Now create an array P that is a mapping from letters in s2
to the correct index in s1:
1 2 3 4 0
a m a l k
P = [1,2,3,4,0]
Now we can count the number of array inversions in P
using a modified mergesort, which will give us the number of elements that are out of order.
Modified mergesort:
int main(int argc, char ** argv) {
int array[] = { 1,2,3,4,0 };
int array_size = sizeof(array)/sizeof(array[0]);
int inversions = merge_sort(array, 0, array_size - 1);
printf("Found %d inversions\n", inversions);
return 0;
}
int merge_sort(int a[], int start, int end) {
if ( end > start ) {
int mid = start + (end - start) / 2;
int x = merge_sort(a, start, mid);
int y = merge_sort(a, mid + 1, end);
int z = merge(a, start, mid, end);
return x + y + z;
}
return 0;
}
int merge(int a[], int start, int mid, int end) {
int l = start, r = mid + 1;
int i = 0;
int temp[end - start + 1];
int splitInversionCount = 0;
while ( l <= mid && r <= end ) {
if ( a[l] < a[r] ) {
temp[i++] = a[l++];
}
else {
splitInversionCount += mid - l + 1;
temp[i++] = a[r++];
}
}
// copy whichever half of the array remains
while ( l <= mid ) {
temp[i++] = a[l++];
}
while ( r <= end ) {
temp[i++] = a[r++];
}
// copy temp back into a
int k;
for(k = 0; k < i; k++) {
a[k + start] = temp[k];
}
return splitInversionCount;
}
The problem with this is that it gives the minimum number of swaps with only adjacent elements, this returns 4
instead of 3
.
Question:
Is there any way to extend this algorithm to capture arbitrary swaps as well? Or do I need a completely different approach?