3

This question is similar to the one posed here however, this answer does not work for my question and it is slightly different.

What I am trying to do would best be shown in code:

//this would be a copy version:
int main(){
   std::vector<uint32_t> order = {0,2,5,6,9,10,1,3,4,7,8,11};
   std::vector<uint32_t> example = {0,1,2,3,4,5,6,7,8,9,10,11};
   std::vector<uint32_t> output(order.size());
   for(uint32_t i = 0; i < order.size(); ++i){
       output[i] = example[order[i]];
   }
}
//output comes out to {0,2,5,6,9,10,1,3,4,7,8,11}

However, when I try to implement an in-place version using the reorder code from the link above as such:

void reorder(std::vector<uint32_t> &v, std::vector<uint32_t> const &order )  {   
    for ( int s = 1, d; s < order.size(); ++ s ) {
        for ( d = order[s]; d < s; d = order[d] ) ;
        if ( d == s ) while ( d = order[d], d != s ) std::swap( v[s], v[d] );
    }
}

int main(){
    std::vector<uint32_t> order = {0,2,5,6,9,10,1,3,4,7,8,11};
    std::vector<uint32_t> example = {0,1,2,3,4,5,6,7,8,9,10,11};
    reorder(example, order);
}
//example = {0,6,1,7,8,2,3,9,10,4,5,11,}

How could I implement an in-place version of what I am trying to accomplish without copying memory?

Edit

I wanted to make something more clear, the vector example from the code can be arbitrary elements, I just happened to make them initialized the way I did for the ease of checking them. This would be entirely valid:

std::vector<uint32_t> example = {300,12,21,34,47,15,61,57,82,94,1,2};
  • order and example will always contain the same number of elements
  • It is not guaranteed that order and example store the same datatype
  • It is also guaranteed that order stores unique values
  • It is also guaranteed that order will always be the datatype uint32_t
  • order is always going to be in a range from 0 to n-1 where n is the size of example, Each of those numbers, exactly once, and nothing outside of that range. (like it does in the example code)
  • The actual order of that range is entirely random, has nothing to do with even or odd indices coming in any order.
Aamir
  • 1,974
  • 1
  • 14
  • 18
Sam Moldenha
  • 463
  • 2
  • 11
  • If you're not looking for a totally generic solution, you can recognize that both vectors are the same length, contain the same type and all values in `order` are unique and valid indices into the vectors. You can then element-wise swap the contents of `order` with `std::swap(order[i], example[order[i]]);` and then afterwards `std::swap(example, order);`. That does require `order` to be non-const, however, along with all the other conditions. – paddy Aug 09 '23 at 02:45
  • How can you test "the reorder code" from the other question when it has more than one answer? And it was marked as a duplicate of yet another question! Until you've tried ALL of the presented solutions, you can't claim they don't solve your problem. – Mark Ransom Aug 09 '23 at 02:51
  • @paddy I am honestly just looking for a working solution, I probably should have made this more clear in my question, but the `example` vector can be any random numbers or elements, I just happened to make them like an `std::iota` initialization for an ease of checking. – Sam Moldenha Aug 09 '23 at 02:57
  • 1
    Is it guaranteed that `order` and `example` contain the _same_ number of elements? Is it guaranteed that `order` and `example` store the same data type? Is it guaranteed that every index in `order` is unique? You've provided an update in your answer, but it clarifies none of these things. – paddy Aug 09 '23 at 03:02
  • @paddy hopefully that edit clarifies things more. – Sam Moldenha Aug 09 '23 at 03:06
  • Can the `order` by arbitrary too? This example has a nice order sometimes known as unshuffle or unzip or uninterlace – harold Aug 09 '23 at 03:07
  • Thanks for the clarification, but I think `order` needs a stronger guarantee: it holds all of the numbers from 0 to n-1, where n is the size of `example`. Each of those numbers, exactly once, and nothing outside of that range. – Mark Ransom Aug 09 '23 at 03:10
  • If `order` is _always_ the even indices followed by the odd indices, with each group in sorted order, this is an entirely different class of problem and you don't need `order` at all. – paddy Aug 09 '23 at 03:12
  • @paddy no, so, there is no guarantee on if the even indices are followed by the odd indices, it just happened to work out that way in my example, it is a completely random order. – Sam Moldenha Aug 09 '23 at 03:13

2 Answers2

3

I've come up with a simple algorithm to do this in O(N) time with O(1) additional storage. Unsurprisingly, it does destroy the contents of order in the process. I don't think you can get around that without either increasing time complexity or storage complexity.

template <typename T>
void reorder(std::vector<T> &v, std::vector<uint32_t> &order)
{
    for (size_t i = 0; i < v.size(); i++)
    {
        size_t next = order[i];      // find index of next output
        while (next < i)             // resolve chain of moves
        {
            next = order[next];
        }
        std::swap(v[i], v[next]);    // exchange data
        order[i] = next;             // record where previous data moved to
    }
}

The basic approach is that you iterate through the ordering one at a time, and you consider every position prior to the current position as solved. You will never modify the ordering or the data for solved positions.

Except for the trivial case where order is a sorted vector, you will inevitably need to move a piece of data out of the way, so that you can bring in the required value at that position. There are two possible scenarios:

  1. The value is already in the right place, or is in some "future" position; or
  2. The value points at a "past" position, so we know it has been moved one or more times such that it's now in a "future" position.

So, on a very basic level, when you discover that for some i, the value of order[i] is less than i, then you know it was moved, and that it moved to the position order[order[i]]. From that position, it may have been moved again. You'll know, by applying exactly the same test, until you end up with an index that's greater than or equal to i. And that's where the data moved to.

The secret to making this O(N) is that, after resolving the final "moved-to" position, you do one final step which is to overwrite order[i] with that new position. That means whatever searching you had to do at this position will never be repeated.

Now, it's not at all obvious that this results in linear time complexity, so don't feel bad if it's a bit mind-bending. I found it hard to convince myself, and before writing this answer I actually had to prove it experimentally by looking for the worst-case total searches for all permutations of order. That was also useful to verify the algorithm's correctness.

The reasoning really boils down to the fact that the more "hops" a single value makes, the fewer hops are possible for other values. By collapsing the result of any search you ensure that search is also O(1) in future. Crucially, that means the process of chaining one extra hop is also O(1) because successive "hops" for the same data value will always point back to a search that was flattened when the data was moved.


Here's the little test harness that verifies all permutations and counts the total number of search indirections performed:

#include <algorithm>
#include <cstdint>
#include <iostream>
#include <numeric>
#include <vector>

typedef int Data;

template <typename T>
size_t reorder(std::vector<T> &v, std::vector<uint32_t> &order)
{
    size_t indirections = 0;
    for (size_t i = 0; i < v.size(); i++)
    {
        size_t next = order[i];
        while (next < i)
        {
            // std::cout << "search " << i << " : hop to " << next << "\n";
            next = order[next];
            indirections++;
        }
        std::swap(v[i], v[next]);
        order[i] = next;
    }
    return indirections;
}

size_t count_worst_case_indirections(size_t size)
{
    size_t max_indirections = 0;

    std::vector<uint32_t> order_perm(size);
    std::vector<uint32_t> order_worst;
    std::vector<uint32_t> order;
    std::vector<Data> data(size);
    std::vector<Data> expected;
    expected.reserve(size);

    // Test all possible orderings
    std::iota(order_perm.begin(), order_perm.end(), 0);
    do
    {
        // Reset initial data and generate expected result
        order = order_perm;
        std::iota(data.begin(), data.end(), 0);
        expected.clear();
        for (auto i : order) expected.push_back(data[i]);

        // Run test
        size_t indirections = reorder(data, order);
        if (indirections > max_indirections)
        {
            max_indirections = indirections;
            order_worst = order_perm;
        }

        // Throw if result is invalid
        if (data != expected) throw "ALGORITHM IS BROKEN";
    } while (std::next_permutation(order_perm.begin(), order_perm.end()));

    std::cerr << "worst order : ";
    for (auto x : order_worst) std::cerr << x << ' ';
    std::cerr << "\n";

    return max_indirections;
}

int main()
{
    for (size_t size = 1; size < 12; size++)
    {
        size_t max_indirections = count_worst_case_indirections(size);
        std::cout << "Size " << size << " : " << max_indirections << "\n";
    }
}

stdout:

Size 1 : 0
Size 2 : 1
Size 3 : 2
Size 4 : 3
Size 5 : 4
Size 6 : 5
Size 7 : 6
Size 8 : 7
Size 9 : 8
Size 10 : 9
Size 11 : 10

stderr:

worst order : 0 
worst order : 1 0 
worst order : 1 2 0 
worst order : 1 2 3 0 
worst order : 1 2 3 4 0 
worst order : 1 2 3 4 5 0 
worst order : 1 2 3 4 5 6 0 
worst order : 1 2 3 4 5 6 7 0 
worst order : 1 2 3 4 5 6 7 8 0 
worst order : 1 2 3 4 5 6 7 8 9 0 
worst order : 1 2 3 4 5 6 7 8 9 10 0

This is clearly worst-case O(N) in searches. If you comment-out the line order[i] = next; in reorder, you will see that you get a worst-case of O(N2).

If you then set up a single-fire experiment with a worst-case ordering (there are many worst-case orderings), and uncomment the line of output in the search loop, you'll see exactly why it's important to flatten the search.

paddy
  • 60,864
  • 6
  • 61
  • 103
  • 1
    Winner. I surrender. – selbie Aug 10 '23 at 04:11
  • Haha, well if this is a battle then next time I'll be sure to throw in a bunch of aggressive taunting and some false leads!! ;) Jokes aside, this problem was bugging the heck out of me, so I'm just happy to have found a solution so I can move on with my life. It's neat that it turned out to be a good fit for dynamic programming _and_ had better time complexity than I initially suspected. – paddy Aug 10 '23 at 04:43
  • 1
    This is a *very* nice algorithm. If I were just a tad less honest, I'd be tempted to set up a sock-puppet account just to give it a second up-vote. – Jerry Coffin Aug 10 '23 at 04:45
2

I misunderstood your question on my first attempt. So I deleted that answer and am resubmitting.

Basically, if you need to avoid a copy of the array, that also implies you couldn't use a hash-table (map) solution for the continous re-ordering. So reordering in place becomes an O(N²) runtime algorithm with O(1) storage beyond what was already allocated.

void reorder(std::vector<uint32_t>& items, std::vector<uint32_t>& order)
{

    // n-squared without additional storage
    for (uint32_t i = 0; i < (uint32_t)items.size(); i++)
    {
        if (order[i] == i)
        {
            continue;
        }

        // keep track of the displacement that's about to be done
        uint32_t displacedValue = items[i];
        uint32_t availableIndex = order[i];

        // swap
        items[i] = items[availableIndex];
        items[availableIndex] = displacedValue;

        // scan ahead in orders array to account for the swap
        for (size_t j = i + 1; j < items.size(); j++)
        {
            if (order[j] == i)
            {
                order[j] = availableIndex;
            }
        }
    }
}

Note that the above code will permute and corrupt the `order` table.

Proof of concept using your example:

int main() {
    std::vector<uint32_t> order = { 0,2,5,6,9,10,1,3,4,7,8,11 };
    std::vector<uint32_t> example = { 0,1,2,3,4,5,6,7,8,9,10,11 };

    reorder(example, order);

    // show the final sort order as applied
    for (size_t i = 0; i < example.size(); i++) {
        std::cout << example[i] << " ";
    }
    std::cout << std::endl;
    return 0;
}

The above prints: 0 2 5 6 9 10 1 3 4 7 8 11

Alternate solution

If you can afford the storage cost of making a copy of order without copying example, converting the above code into an O(N) solution for both runtime and storage.

void reorder2(std::vector<uint32_t>& items, std::vector<uint32_t>& order) {

    // create a reverse lookup table on order
    std::unordered_map<uint32_t, uint32_t> reverseLookup; // reverse of order table
    for (uint32_t i = 0; i < (uint32_t)items.size(); i++)
    {
        reverseLookup[order[i]] = i;
    }

    for (uint32_t i = 0; i < (uint32_t)items.size(); i++)
    {
        if (order[i] == i)
        {
            continue;
        }

        // keep track of the displacement that's about to be done
        uint32_t displacedValue = items[i];
        uint32_t availableIndex = order[i];

        // swap
        items[i] = items[availableIndex];
        items[availableIndex] = displacedValue;

        // account for the swap
        uint32_t j = reverseLookup[i];
        order[j] = availableIndex;
        reverseLookup[availableIndex] = j;
    }
}
selbie
  • 100,020
  • 15
  • 103
  • 173
  • I have a gut feeling there must be a clever and non-obvious way to repurpose the already-processed part of `order` at each step, combined with a O(logN) mechanism to determine how to reshuffle values that need to be moved out of the way... a little bit like a heap. But my brain just isn't up to the task. I just can't accept that there doesn't exist an in-place O(N.logN) solution with O(1) storage. – paddy Aug 09 '23 at 23:03
  • @paddy - I had that exact same thought last night when I was hacking this up for the OP. I might tray again later tonight. Feel free to submit an answer yourself if you get something working. I might just upvote it myself! – selbie Aug 09 '23 at 23:41
  • Yeah, I did had a go at it yesterday and failed. I was actually on the right track, but for whatever reason was overlooking a detail. Figured it out today with a fresher mind, and it turns out you can do this in O(N) :) The solution is stupidly simple, but wasn't immediately apparent to the likes of me! – paddy Aug 10 '23 at 04:04