1

I have an object that needs to be sorted, but although it has indices with sortable properties, it's not a list. Instead, it has a swap method. Is there a way to sort it?

My idea was to use a separate list and track the steps needed to sort it so that I can apply each swap to the object, but I can't figure out what to overload.

That is, most (all?) sort algorithms have a swap line inside them that looks like this:

my_list[i], my_list[j] = my_list[j], my_list[i]

and I'd like to get the set of all (i,j) pairs.

I'd rather not re-implement sort on my own.

UPDATE: using a quicksort implementation I found, I got something that's working:

def partition(array, begin, end):
    pivot = begin
    for i in range(begin+1, end+1):
        if array[i] <= array[begin]:
            pivot += 1
            #array[i], array[pivot] = array[pivot], array[i]
            array.swap(i, pivot)
    #array[pivot], array[begin] = array[begin], array[pivot]
    array.swap(pivot, begin)
    return pivot

def quicksort(array, begin=0, end=None):
    if end is None:
        end = len(array) - 1
    def _quicksort(array, begin, end):
        if begin >= end:
            return
        pivot = partition(array, begin, end)
        _quicksort(array, begin, pivot-1)
        _quicksort(array, pivot+1, end)
    return _quicksort(array, begin, end)


class SwapSorter(list):
    def __init__(self, array, obj):
        super().__init__(array)
        self._obj = obj

    def swap(self, i, j):
        if i == j: return
        print("swapping:",i,j)
        self[i], self[j] = self[j], self[i]
        self._obj.swap(i,j)


ss = SwapSorter(array, obj)
quicksort(ss)

print("sorted:")
print(ss)
print(ss._obj)

but (I think) I'd like to just use built-in sort instead of my own... or is this just common practice to roll your own sort?

amos
  • 5,092
  • 4
  • 34
  • 43

1 Answers1

0

The actual answer

It sounds like what you're doing is fine, but I got curious and came up with an answer more in the spirit of what you're asking. See later sections for motivation and how it works, but cutting straight to the chase:

def sort_with_swap(mylist):
    # Source locations of items (satifies: source_index = source_indices[dest_index])
    source_indices = sorted(range(len(mylist)), key=lambda i: mylist[i])

    # Destination locations of items (satisfies: dest_index = destination_indices[source_index])
    destination_indices = [None] * len(source_indices)
    for dest_index, source_index in enumerate(source_indices):
        destination_indices[source_index] = dest_index

    # Perform the swaps
    for dest_index, source_index in enumerate(source_indices):
        # Nothing to do if src==dest (although code below actually works if you remove this)
        if source_index == dest_index:
            continue

        # Swap the source item into the correct destination
        mylist.swap(source_index, dest_index)

        # Update the index lists for the new location of the item that had been in the destination
        old_dest_at_dest_index = destination_indices[dest_index]
        source_indices[old_dest_at_dest_index] = source_index
        destination_indices[source_index] = old_dest_at_dest_index

        # These two updates are not necessary but leave index lists in correct state at the end
        # source_indices[dest_index] = dest_index
        # destination_indices[dest_index] = dest_index

Testing

You can try out the function like this:

class SwappableList(list):
    def swap(self, i, j):
        self[i], self[j] = self[j], self[i]

s = SwappableList("bcdefahg")
print(f"before: {''.join(s)}") # prints "before: bcdefahg"
sort_with_swap(s)
print(f"after:  {''.join(s)}") # prints "after:  abcdefgh"

Motivation

I think your question suggests tracking what is being swapped while a sort is happening, and performing those swaps directly to your list. As you can see, I take a different tack: allow the whole sort to proceed on a list of indices and then swap into place based on that. This has a few benefits:

  1. It's less hacky, because to track what sort is doing while it's happening you'd have to wait for two assignments, check they were really a swap and then stitch them back together.
  2. It won't break if the sort implementation does anything different than swaps, for example permuting three elements, or keeping a pivot item off in a separate location.
  3. It reduces the number of swaps on your list-like structure, which by the sounds of it might be expensive.

How it works

Here is a first naive attempt at my idea:

# Warning: doesn't work
def sort_with_swap_naive(mylist):
    source_indices = sorted(range(len(mylist)), key=lambda i: mylist[i])
    for dest_index, source_index in enumerate(source_indices):
        mylist.swap(source_index, dest_index)

It's pretty simple:

  1. We sort a list of indices instead of the original list. For example, for list('bcdefahg') you get [5, 0, 1, 2, 3, 4, 7, 6].
  2. We go through that list of indices and swap each of the items into their correct places.

Unfortunately, this simple version doesn't work. For a start, if you track the swaps through the last two items ('g' and 'h'), you'll find we swap them into their correct locations and immediately swap them back! The problem is more complicated for the permuted portion of the list, but the underlying problem is the same…

The problem is this: we are using off source_indices to determine which elements to swap, but as we go through the list it gets out of sync. To fix the problem, we also need to update that as we do the swaps. It turns out that to do it efficiently we need a list containing the opposite mapping i.e. the list of destination indices. To see what needs updating here's the original list before and after swapping elements 0 and 5:

mylist = ['b', 'c', 'd', 'e', 'f', 'a', 'h', 'g']
sources = [5, 0, 1, 2, 3, 4, 7, 6]
destinations = [1, 2, 3, 4, 5, 0, 7, 6]

# Swap item at 0 with item at 5
mylist = ['a', 'c', 'd', 'e', 'f', 'b', 'h', 'g']
# updates:      ^                   ^
sources = [0, 5, 1, 2, 3, 4, 7, 6]
# updates: ^  ^
destinations = [0, 2, 3, 4, 5, 1, 7, 6]
# updates:      ^              ^

As you can see (maybe!), the key thing to update is the item that was at the destination index (i.e. the 'b' that was at 0) immediately before the swap.

Jim Oldfield
  • 674
  • 3
  • 8