17

If I have a size N array of objects, and I have an array of unique numbers in the range 1...N, is there any algorithm to rearrange the object array in-place in the order specified by the list of numbers, and yet do this in O(N) time?

Context: I am doing a quick-sort-ish algorithm on objects that are fairly large in size, so it would be faster to do the swaps on indices than on the objects themselves, and only move the objects in one final pass. I'd just like to know if I could do this last pass without allocating memory for a separate array.

Edit: I am not asking how to do a sort in O(N) time, but rather how to do the post-sort rearranging in O(N) time with O(1) space. Sorry for not making this clear.
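
For concreteness, the trivial rearrangement with a separate scratch array (the allocation I'd like to avoid) looks like this, assuming the convention result[k] = data[p[k]]; the method name is just illustrative:

static <T> void arrangeWithScratch(T[] data, int[] p) {
    // O(N) extra space: copy everything, then place each element where p says.
    T[] scratch = java.util.Arrays.copyOf(data, data.length);
    for (int k = 0; k < p.length; k++)
        data[k] = scratch[p[k]];
}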

int3
  • 12,861
  • 8
  • 51
  • 80

9 Answers

16

I think this should do:

static <T> void arrange(T[] data, int[] p) {
    boolean[] done = new boolean[p.length];
    for (int i = 0; i < p.length; i++) {
        if (!done[i]) {
            T t = data[i];                  // value displaced from the start of this cycle
            for (int j = i;;) {             // follow the cycle that starts at i
                done[j] = true;

                if (p[j] != i) {
                    data[j] = data[p[j]];   // each slot receives the value it asks for
                    j = p[j];
                } else {
                    data[j] = t;            // cycle closed: the saved value goes in the last slot
                    break;
                }
            }
        }
    }
}

Note: This is Java. If you do this in a language without garbage collection, be sure to delete done.

If you care about space, you can use a BitSet for done. I assume you can afford an additional bit per element because you seem willing to work with a permutation array, which is several times that size.

This algorithm copies instances of T n + k times, where k is the number of cycles in the permutation. You can reduce this to the optimal number of copies by skipping those i where p[i] = i.
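
A minimal sketch of that variant: the same cycle-following loop, but with java.util.BitSet for the marks and with fixed points (p[i] == i) skipped to save the extra copies. The method name is illustrative.

static <T> void arrangeBitSet(T[] data, int[] p) {
    java.util.BitSet done = new java.util.BitSet(p.length);
    for (int i = 0; i < p.length; i++) {
        if (done.get(i) || p[i] == i)   // skip finished indices and fixed points
            continue;
        T t = data[i];
        for (int j = i;;) {
            done.set(j);
            if (p[j] != i) {
                data[j] = data[p[j]];
                j = p[j];
            } else {
                data[j] = t;
                break;
            }
        }
    }
}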

meriton
  • 68,356
  • 14
  • 108
  • 175
  • 1
    If `p` is a temporary array, you can use it in place of the `done` array - just set p[i] to -1 (or some other sentinel value) when data[i] is in its proper place. – Mark Ransom Nov 05 '09 at 21:57
  • Good point, that could optimize memory accesses even further. Of course, it is also somewhat error-prone because the caller might not expect the permutation to be modified. – meriton Nov 05 '09 at 22:07
  • 1
    "done" is currently leaked, so I'm all in favour of that BitSet. – Steve Jessop Nov 05 '09 at 22:30
  • Neglecting possible concurrency issues, you could always in-place modify p[] and then reverse the modification. – Heath Hunnicutt Nov 05 '09 at 22:33
  • @Heath: in this case yes, since p is an array of int containing array indexes, which necessarily are non-negative. So you can use the sign bit as a flag. But that's a fortunate implementation detail - arguably p should be an array of size_t or something. In the context of the question, though, trashing p hopefully isn't a problem, since it's allocated by the quick-sort-ish algorithm itself. In fact, it seems a little unfair for the question to ask for a O(1) space final step to an algorithm which itself uses O(N) extra memory already... – Steve Jessop Nov 05 '09 at 22:46
  • 1
    +1, yes, the solution I was thinking about. If you really need to save space, I think in practice it might be possible to get away without any additional storage, just negate the sign of the indices that were already processed in the permutation array p[]. We start from zero, so we know it's always processed first. At the end, if you need to restore the original array, negate the signs again to make them positive again. – Kris Nov 06 '09 at 09:09
7

Do you mean that you have an array of objects O[1..N], and an array P[1..N] that contains a permutation of the numbers 1..N, and in the end you want to get an array O1 of objects such that O1[k] = O[P[k]] for all k=1..N?

As an example, if your objects are the letters A,B,C,...,Y,Z and your array P is [26,25,24,...,2,1], is your desired output Z,Y,...,C,B,A?

If yes, I believe you can do it in linear time using only O(1) additional memory. Reversing elements of an array is a special case of this scenario. In general, I think you would need to consider decomposition of your permutation P into cycles and then use it to move around the elements of your original array O[].

If that's what you are looking for, I can elaborate more.

EDIT: Others already presented excellent solutions while I was sleeping, so no need to repeat it here. ^_^

EDIT: My O(1) additional space is indeed not entirely correct. I was thinking only about the "data" elements, but in fact you also need to store one bit per permutation element, so strictly speaking that is O(n) extra bits. But most of the time using a sign bit (as suggested by J.F. Sebastian) is fine, so in practice we may not need anything more than we already have.
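
A minimal sketch of that sign-bit bookkeeping, built on the same cycle-following loop as the accepted answer: a visited slot is marked by storing the bitwise complement ~p[j], which stays negative even for index 0, and p is restored at the end. The method name is illustrative.

static <T> void arrangeReusingP(T[] data, int[] p) {
    for (int i = 0; i < p.length; i++) {
        if (p[i] < 0)
            continue;                 // already visited as part of an earlier cycle
        T t = data[i];
        int j = i;
        while (true) {
            int next = p[j];
            p[j] = ~next;             // mark j as visited (always negative, even for index 0)
            if (next == i) {
                data[j] = t;          // cycle closed
                break;
            }
            data[j] = data[next];
            j = next;
        }
    }
    for (int i = 0; i < p.length; i++)
        p[i] = ~p[i];                 // undo the marks, restoring p
}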

Kris
  • 1,388
  • 6
  • 12
  • This will be a bit tricky. I'd advise programming with care and testing well afterwards. I don't think you need to deal explicitly with cycles as long as the permutation array can be written to. Also, if the arrays are very, very large, locality might be important, and this method can have terrible locality. (Locality = working with a relatively small number of cache lines or memory pages at a time.) – David Thornley Nov 05 '09 at 20:20
  • yes, I will give more details tomorrow - I have to sit down with a pen and paper for a moment. The idea is that each permutation can be decomposed into a product of disjoint cycles, and each cycle moves elements around while requiring only one temporary value to be stored. Our example of reversing the order has permutation [1,2,...,26] -> [26, 25, ... , 2,1], which decomposes into cycles (1 26)(2 25) etc... Dealing with one cycle in that case is just a swap: store one value, move the other, restore from temp. But you can generalize this to any cycle length. – Kris Nov 05 '09 at 20:46
  • 1
    I did the detail work, see my answer for the result. However, I don't see how to do that decomposition with O(1) memory, so I allowed myself an additional bit per element to be sorted. – meriton Nov 05 '09 at 21:36
  • 1
    I believe it can't be done in O(1) space, O(n) time. See related question Algorithm to determine if array contains n…n+m? http://stackoverflow.com/questions/177118 The best I'd come up with is to use a sign bit to mark visited items http://stackoverflow.com/questions/177118/algorithm-to-determine-if-array-contains-n-nm/311497#311497 – jfs Nov 06 '09 at 01:40
7

The approach is to follow the "permutation cycles" of the permutation, rather than indexing the array left-to-right. But since you do have to begin somewhere, every time a new permutation cycle is needed, the search for an unpermuted element proceeds left-to-right:

// Pseudo-code
N : integer, N > 0 // N is the number of elements
swaps : integer [0..N]
data[N] : array of object
permute[N] : array of integer [-1..N]  denoting permutation (used element is -1)
next_scan_start : integer;
next_scan_start = 0;
while (swaps < N) {
    // Search for the next index that is not yet permuted.
    for (idx_cycle_search = next_scan_start; idx_cycle_search < N; ++idx_cycle_search)
        if (permute[idx_cycle_search] >= 0)
            break;

    next_scan_start = idx_cycle_search + 1;

    // This is a provable invariant. In short, the number of non-negative
    // elements in permute[] equals (N - swaps).
    assert( idx_cycle_search < N );

    // Completely permute one permutation cycle, 'following the
    // permutation cycle's trail'. This is O(N) overall.
    while (permute[idx_cycle_search] >= 0) {
        swap( data[idx_cycle_search], data[permute[idx_cycle_search]] );
        swaps++;
        old_idx = idx_cycle_search;
        idx_cycle_search = permute[idx_cycle_search];
        permute[old_idx] = -1;
        // Also '= -idx_cycle_search - 1' could be used rather than '-1'
        // and would allow reversal of these changes to the permute[] array.
    }
}
Heath Hunnicutt
  • 18,667
  • 3
  • 39
  • 62
  • 2
    You're nearly correct. If you refrained from starting over with the idx_cycle_search after every cycle you'd reach execution time in O(n). Your current algorithm will perform n * (n + 1) / 2 iterations in this loop if permute is the identity transformation. – meriton Nov 05 '09 at 21:51
1

If you didn't mind allocating memory for an extra hash of indexes, you could keep a mapping of original location to current location to get a time complexity of near O(n). Here's an example in Ruby, since it's readable and pseudocode-ish. (This could be shorter or more idiomatically Ruby-ish, but I've written it out for clarity.)

#!/usr/bin/ruby

objects       = ['d', 'e', 'a', 'c', 'b']
order         = [2, 4, 3, 0, 1]
cur_locations = {}

order.each_with_index do |orig_location, ordinality|
  # Find the current location of the item.
  cur_location = orig_location
  while not cur_locations[cur_location].nil? do
    cur_location = cur_locations[cur_location]
  end

  # Swap the items and keep track of whatever we swapped forward.
  objects[ordinality], objects[cur_location] = objects[cur_location], objects[ordinality]
  cur_locations[ordinality] = orig_location
end

puts objects.join(' ')

That obviously does involve some extra memory for the hash, but since it's just for indexes and not your "fairly large" objects, hopefully that's acceptable. Hash lookups are O(1), and although an item that has been swapped forward more than once forces you to follow cur_location through several hops, the algorithm as a whole should still be reasonably close to O(n).

If you wanted you could build a full hash of original to current positions ahead of time, or keep a reverse hash of current to original, and modify the algorithm a bit to get it down to strictly O(n). It'd be a little more complicated and take a little more space, so this is the version I wrote out, but the modifications shouldn't be difficult.

EDIT: Actually, I'm fairly certain the time complexity is just O(n), since each ordinality can have at most one hop associated, and thus the maximum number of lookups is limited to n.

John Hyland
  • 6,855
  • 28
  • 32
1

#!/usr/bin/env python

def rearrange(objects, permutation):
    """Rearrange `objects` inplace according to `permutation`.

       ``result = [objects[p] for p in permutation]``
    """
    seen = [False] * len(permutation)
    for i, already_seen in enumerate(seen):
        if not already_seen: # start permutation cycle
            first_obj, j = objects[i], i
            while True:
                seen[j] = True
                p = permutation[j]
                if p == i: # end permutation cycle
                    objects[j] = first_obj    # [old] p -> j
                    break
                objects[j], j = objects[p], p #       p -> j

The algorithm (as I've noticed after I wrote it) is the same as the one from @meriton's answer in Java.

Here's a test function for the code:

def test():
    import itertools
    N = 9
    for perm in itertools.permutations(range(N)):
        L = list(range(N))
        LL = L[:]
        rearrange(L, perm)
        assert L == [LL[i] for i in perm] == list(perm), (L, list(perm), LL)

    # test whether assertions are enabled
    try:
        assert 0
    except AssertionError:
        pass
    else:
        raise RuntimeError("assertions must be enabled for the test")

if __name__ == "__main__":
    test()
jfs
  • 399,953
  • 195
  • 994
  • 1,670
0

There's a histogram sort, though its running time is given as a bit higher than O(N): O(N log log N).

Steve B.
  • 55,454
  • 12
  • 93
  • 132
  • I'm afraid that I'm stuck with a quicksort approach.. I'm actually doing some sort of spatial partitioning hierarchy, using the pivot to split the list of objects into two at each level. – int3 Nov 05 '09 at 19:55
0

I can do it given O(N) scratch space -- copy to a new array and copy back.

EDIT: I am aware of the existence of an algorithm that will do this in place. The idea is to perform the swaps on the array of integers 1..N while at the same time mirroring the swaps on your array of large objects. I just cannot find the algorithm right now.
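
A minimal sketch of that idea (illustrative, and not necessarily the exact algorithm referred to): sort p into the identity permutation and mirror each swap on the object array. Every swap fixes at least one slot of p, so the whole pass is O(N). Note the convention: this sends data[i] to position p[i], which is the inverse of the O1[k] = O[P[k]] convention discussed elsewhere on this page, so invert p first if you need that convention.

static <T> void mirrorSwaps(T[] data, int[] p) {
    for (int i = 0; i < p.length; i++) {
        while (p[i] != i) {
            int j = p[i];
            // swap data[i] and data[j]
            T t = data[i]; data[i] = data[j]; data[j] = t;
            // mirror the swap on p; afterwards p[j] == j, so slot j is fixed for good
            int u = p[i]; p[i] = p[j]; p[j] = u;
        }
    }
}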

Joshua
  • 40,822
  • 8
  • 72
  • 132
0

The problem is one of applying a permutation in place with only O(1) extra storage: "in-situ permutation".

It is solvable, but an algorithm is not obvious beforehand.

It is described briefly as an exercise in Knuth, and for work I had to decipher it and figure out how it worked. Look at 5.2 #13.

For some more modern work on this problem, with pseudocode:

http://www.fernuni-hagen.de/imperia/md/content/fakultaetfuermathematikundinformatik/forschung/berichte/bericht_273.pdf

Matt Kennel
  • 134
  • 2
  • I skimmed that paper. It seems to me that they don't claim a worst case time complexity of O(n). – meriton Nov 22 '09 at 16:05
0

I ended up writing a different algorithm for this, which first generates the list of swaps needed to apply an ordering, and then runs through those swaps to apply it. The advantage is that if you're applying the ordering to multiple lists, you can reuse the swap list, since the swap algorithm is extremely simple.

#include <vector>
#include <string>
#include <utility>
using namespace std;

void make_swaps(vector<int> order, vector<pair<int,int>> &swaps)
{
    // order[0] is the index in the old list of the new list's first value.
    // Invert the mapping: inverse[0] is the index in the new list of the
    // old list's first value.
    vector<int> inverse(order.size());
    for(int i = 0; i < order.size(); ++i)
        inverse[order[i]] = i;

    swaps.resize(0);

    for(int idx1 = 0; idx1 < order.size(); ++idx1)
    {
        // Swap list[idx] with list[order[idx]], and record this swap.
        int idx2 = order[idx1];
        if(idx1 == idx2)
            continue;

        swaps.push_back(make_pair(idx1, idx2));

        // list[idx1] is now in the correct place, but whoever wanted the value we moved out
        // of idx2 now needs to look in its new position.
        int idx1_dep = inverse[idx1];
        order[idx1_dep] = idx2;
        inverse[idx2] = idx1_dep;
    }
}

template<typename T>
void run_swaps(T data, const vector<pair<int,int>> &swaps)
{
    for(const auto &s: swaps)
    {
        int src = s.first;
        int dst = s.second;
        swap(data[src], data[dst]);
    }
}

void test()
{
    vector<int> order = { 2, 3, 1, 4, 0 };

    vector<pair<int,int>> swaps;
    make_swaps(order, swaps);

    vector<string> data = { "a", "b", "c", "d", "e" };
    run_swaps(data, swaps);
}
Glenn Maynard
  • 55,829
  • 10
  • 121
  • 131