4

I want to sort (reorder) a Vec in-place by a predefined ordering in Rust.

For example:

let i = vec![0, 3, 2, 1];
let mut v = vec!["a", "b", "c", "d"];
v.sort_by_indices(&i);

assert_eq!(v, &["a", "d", "c", "b"]);

I would like to do this in-place. In my use case, v takes up a lot of memory.

This question is a follow-up to How to get the indices that would sort a Vec?

Michael Hall
  • 2,834
  • 1
  • 22
  • 40
  • 1
    Does mutating(or cloning) indices array`i` is a concern? If it is not please check this [solution](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=f6f88aee8e5145a6499ebe9e292f87a6) – Ömer Erden Oct 29 '21 at 08:56
  • 1
    Something that's not perfectly clear from your example: does `i` contain the index _from_ which the output value should come, or the index _to_ which the input value should go? I.e. if `i = [1, 2, 0]` should the result be `["b", "c", "a"]` or `["c", "a", "b"]`? – Jmb Oct 29 '21 at 09:31
  • I guess what the OP wanted to do is to restore (or 'un-sort') the vector back into the state before the function `argsort` mentioned in the other [post](https://stackoverflow.com/q/69764050/5299417) was called. `i` here is the output of `argsort` and `v` was sorted with `argsort`. If that's the case, I think ["c", "a", "b"] would be the expected result for @Jmb 's case. – Joe_Jingyu Oct 29 '21 at 10:49
  • `i` is the indices that would sort an array. So, in the above example, it says, first, it element 0, second is element 3, third is element 2, etc. I hope that clarifies? – Michael Hall Oct 29 '21 at 23:56
  • @ÖmerErden, mutating or cloning `i` is fine for me. I am more concerned with not duplicating `v`. Your solution works great. Feel free to add it as an answer. – Michael Hall Oct 30 '21 at 00:10
  • Michael, what is the expected result if the indices are [3, 0, 4, 1, 2] and the vector data are ["a", "b", "c", "d", "e"]? I think it should be ["d", "b", "c", "a", "e"]. Can you please confirm? @kmdreko 's solution looks incorrect to me. But, I would like to double confirm the expected behavior with you first before making the call. – Joe_Jingyu Oct 30 '21 at 04:23
  • I would expect the example you give to return `["d", "a", "e", "b", "c"]`. As mentioned in the linked question, I am trying to mirror numpy [argsort](https://numpy.org/doc/stable/reference/generated/numpy.argsort.html) behaviour. – Michael Hall Oct 30 '21 at 05:09
  • 1
    Thanks for clarification Michael. I misunderstood the meaning of the indices. kmdreko's solution looks correct to me now. – Joe_Jingyu Oct 30 '21 at 07:44
  • 1
    My solution is a bit hacky, it works since `Vec::sort` implementation swaps elements by focusing on `is_less`, but this behavior encapsulated, it means backward compatibility is not guaranteed, if you are going to use it please keep this in mind – Ömer Erden Oct 30 '21 at 08:59
  • To help people find this question, perhaps change the word "sort" to "reorder", because the term "sort" typically refers only to the specific reordering that results in a monotonically non-decreasing ordering – RBF06 May 25 '22 at 02:28

3 Answers3

6

An in-place implementation is tricky but possible in O(n) time. It works by chasing indices and swapping elements until it gets back to where it started. Unfortunately, this does require scratch space to keep track of what elements are already sorted. Here's an implementation that reuses the index array if its allowed to consume it:

fn sort_by_indices<T>(data: &mut [T], mut indices: Vec<usize>) {
    for idx in 0..data.len() {
        if indices[idx] != idx {
            let mut current_idx = idx;
            loop {
                let target_idx = indices[current_idx];
                indices[current_idx] = current_idx;
                if indices[target_idx] == target_idx {
                    break;
                }
                data.swap(current_idx, target_idx);
                current_idx = target_idx;
            }
        }
    }
}

See it working on the playground (with improvements by @Jmb).

Otherwise you would need a separate allocation for the scratch space (perhaps a BitVec). Feel free to comment or edit if this method is possible without keeping track of the sorted elements.

kmdreko
  • 42,554
  • 6
  • 57
  • 106
  • This works. What do you mean by "requires scratch space"? Also, seems like [`usize::max_value()` is deprecated in favour of the constant `usize::MAX`](https://doc.rust-lang.org/std/primitive.usize.html#method.max_value) – Michael Hall Oct 30 '21 at 00:20
  • Thanks for the heads up, I've updated it. The "scratch space" required is tracking each element whether its already in its final sorted position or not. It needs this because it may put later elements in the correct place while organizing an earlier element. But it needs to iterate through the whole array anyway. If it didn't, it would undo prior work. – kmdreko Oct 30 '21 at 01:00
  • Another option to avoid the "magic" `SORTED` value is to use `indices[idx] == idx` as the marker for elements that are in the correct place. – Jmb Nov 02 '21 at 07:36
  • @Jmb that works when index hopping, but how do you make the top level for loop ignore those elements? – kmdreko Nov 02 '21 at 16:00
  • Same way: if `indices[idx] == idx`, then the element is already in the right spot so you don't need to move it. – Jmb Nov 02 '21 at 19:34
  • @Jmb can you provide some code? In order for that to work, I'd have to swap elements of `indices` as well as `data`, no? Hmm, I guess that would be a little better. – kmdreko Nov 02 '21 at 19:39
  • Nope, the only thing you need to change is to replace `SORTED` with the appropriate index everywhere: [playground](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=c1a2467fae46b4b44693c3bf1c0e6ce4) – Jmb Nov 03 '21 at 07:25
  • @Jmb doh! I was blind. Thanks for showing me that! I've updated the answer with those changes. – kmdreko Nov 03 '21 at 07:32
1

In the same way it enough to implement yourself

let i = vec![0, 3, 2, 1];
let mut v = vec!["a", "b", "c", "d"];

v = i.into_iter().map(|x| v[x]).collect::<Vec<&str>>();

assert_eq!(v, &["a", "d", "c", "b"]);
Zeppi
  • 1,175
  • 6
  • 11
  • 3
    That's not in place though – Jmb Oct 29 '21 at 08:21
  • I am not claiming that this is the right idea. I'm just saying that the equivalent of the "sort_by_index" function does not exist to my knowledge and that you have to do it yourself. I propose an idea which is what it is there – Zeppi Oct 29 '21 at 09:40
1

You can create a HashMap or BTreeMap lookup table and use it as key searcher:

use std::collections::HashMap;
fn main() {
    let i = vec![0, 3, 2, 1];
    let mut v = vec!["a", "b", "c", "d"];

    let keys: HashMap<_, _> = v.iter().cloned().zip(i.iter()).collect();
    v.sort_by_cached_key(|e| keys[e]);

    assert_eq!(v, &["a", "d", "c", "b"]);
}

Playground

Netwave
  • 40,134
  • 6
  • 50
  • 93