-1

For example, I have a Vec<String> and an array storing indexes.

let src = vec!["a".to_string(), "b".to_string(), "c".to_string()];
let idx_arr = [2_usize, 0, 1];

The indexes stored in idx_arr comes from the range 0..src.len(), without repetition or omission.

I want to move the elements in src to another container in the given order, until the vector is completely consumed. For example,

let iter = into_iter_in_order(src, &idx_arr);
for s in iter {
    // s: String
}
// or
consume_vec_in_order(src, &idx_arr, |s| {
    // s: String
});

If the type of src can be changed to Vec<Option<String>>, things will be much easier, just use src[i].take(). However, it cannot.

Edit:

"Another container" refers to any container, such as a queue or hash set. Reordering in place is not the answer to the problem. It introduces the extra time cost of O(n). The ideal method should be 0-cost.

Fuu
  • 165
  • 2
  • 11
  • Are you guaranteed that all elements appear in the list? – Chayim Friedman Feb 02 '23 at 13:20
  • Can you clarify your container requirements? A hash set does not have the same problems as vectors since you can `.remove()` at any key value at any time with `O(1)` complexity. The naive solution would be the best. And its not clear what you mean by "queue", since many queue-like structures do not provide random access. And especially strange since your self answer with the "ideal method" only works with `Vec`. – kmdreko Feb 03 '23 at 16:13

3 Answers3

0

Not sure if my algorithm satisfies your requirements but here I have an algorithm that can consume the provided vector in-order without initializing a new temporary vector, which is more efficient for a memory.

fn main() {
    let src = &mut vec!["a".to_string(), "b".to_string(), "c".to_string(), "d".to_string()];
    let idx_arr = [2_usize, 3, 1, 0];

    consume_vector_in_order(src, idx_arr.to_vec());

    println!("{:?}", src); // d , c , a , b
}

// In-place consume vector in order
fn consume_vector_in_order<T>(v: &mut Vec<T>, inds: Vec<usize>) -> &mut Vec<T>
where
    T: Default,
{
    let mut i: usize = 0;
    let mut temp_inds = inds.to_vec();
    while i < inds.to_vec().len() {
        let s_index = temp_inds[i];
        if s_index != i {
            let new_index = temp_inds[s_index];
            temp_inds.swap(s_index, new_index); 
            v.swap(s_index, new_index);
        } else {
            i += 1;
        }
    }
    v
}

  • Its hard to articulate, but OP appears to want `2` being first to mean `"c"` should be accessed first, rather than the `"a"` (the first element) should be processed as-if at index `2`. So the desired result would print `c, d, b, a`. See OPs own answer for a test case. – kmdreko Feb 02 '23 at 18:45
-1

You can use the technique found in How to sort a Vec by indices? (using my answer in particular) since that can reorder the data in-place from the indices, and then its just simple iteration:

fn consume_vec_in_order<T>(mut vec: Vec<T>, order: &[usize], mut cb: impl FnMut(T)) {
    sort_by_indices(&mut vec, order.to_owned());
    for elem in vec {
        cb(elem);
    }
}

Full example available on the playground.

kmdreko
  • 42,554
  • 6
  • 57
  • 106
  • This code introduces an additional O(n) time overhead. – Fuu Feb 03 '23 at 11:31
  • @Fuu there is no "additional" when talking about runtime complexity. Calling `cb` with `n` elements is already `O(n)`. Your own solution requires calling `.next()` `n` times and has the same order of complexity as shown here. – kmdreko Feb 03 '23 at 16:09
-2

Edit:

An ideal method, but needs to access unstable features and functions not exposed by the standard library.

use std::alloc::{Allocator, RawVec};
use std::marker::PhantomData;
use std::mem::{self, ManuallyDrop};
use std::ptr::{self, NonNull};

#[inline]
unsafe fn into_iter_in_order<'a, T, A: Allocator>(
    vec: Vec<T, A>,
    order: &'a [usize],
) -> IntoIter<'a, T, A> {
    unsafe {
        let mut vec = ManuallyDrop::new(vec);
        let cap = vec.capacity();
        let alloc = ManuallyDrop::new(ptr::read(vec.allocator()));
        let ptr = order.as_ptr();
        let end = ptr.add(order.len());
        IntoIter {
            buf: NonNull::new_unchecked(vec.as_mut_ptr()),
            _marker_1: PhantomData,
            cap,
            alloc,
            ptr,
            end,
            _marker_2: PhantomData,
        }
    }
}

struct IntoIter<'a, T, A: Allocator> {
    buf: NonNull<T>,
    _marker_1: PhantomData<T>,
    cap: usize,
    alloc: ManuallyDrop<A>,
    ptr: *const usize,
    end: *const usize,
    _marker_2: PhantomData<&'a usize>,
}

impl<T, A: Allocator> Iterator for IntoIter<T, A> {
    type Item = T;

    #[inline]
    fn next(&mut self) -> Option<T> {
        if self.ptr == self.end {
            None
        } else {
            let idx = unsafe { *self.ptr };
            self.ptr = unsafe { self.ptr.add(1) };

            if T::IS_ZST {
                Some(unsafe { mem::zeroed() })
            } else {
                Some(unsafe { ptr::read(self.buf.as_ptr().add(idx)) })
            }
        }
    }
}

impl<#[may_dangle] T, A: Allocator> Drop for IntoIter<T, A> {
    fn drop(&mut self) {
        struct DropGuard<'a, T, A: Allocator>(&'a mut IntoIter<T, A>);

        impl<T, A: Allocator> Drop for DropGuard<'_, T, A> {
            fn drop(&mut self) {
                unsafe {
                    // `IntoIter::alloc` is not used anymore after this and will be dropped by RawVec
                    let alloc = ManuallyDrop::take(&mut self.0.alloc);
                    // RawVec handles deallocation
                    let _ = RawVec::from_raw_parts_in(self.0.buf.as_ptr(), self.0.cap, alloc);
                }
            }
        }

        let guard = DropGuard(self);
        // destroy the remaining elements
        unsafe {
            while self.ptr != self.end {
                let idx = *self.ptr;
                self.ptr = self.ptr.add(1);

                let p = if T::IS_ZST {
                    self.buf.as_ptr().wrapping_byte_add(idx)
                } else {
                    self.buf.as_ptr().add(idx)
                };
                ptr::drop_in_place(p);
            }
        }
        // now `guard` will be dropped and do the rest
    }
}

Example:

let src = vec![
    "0".to_string(),
    "1".to_string(),
    "2".to_string(),
    "3".to_string(),
    "4".to_string(),
];
let mut dst = vec![];
let iter = unsafe { into_iter_in_order(src, &[2, 1, 3, 0, 4]) };
for s in iter {
    dst.push(s);
}
assert_eq!(dst, vec!["2", "1", "3", "0", "4"]);

My previous answer:

use std::mem;
use std::ptr;

pub unsafe fn consume_vec_in_order<T>(vec: Vec<T>, order: &[usize], mut cb: impl FnMut(T)) {
    // Check whether `order` contains all numbers in 0..len without repetition
    // or omission.
    if cfg!(debug_assertions) {
        use std::collections::HashSet;

        let n = order.len();
        if n != vec.len() {
            panic!("The length of `order` is not equal to that of `vec`.");
        }
        let mut set = HashSet::<usize>::new();
        for &idx in order {
            if idx >= n {
                panic!("`{idx}` in the `order` is out of range (0..{n}).");
            } else if set.contains(&idx) {
                panic!("`order` contains the repeated element `{idx}`");
            } else {
                set.insert(idx);
            }
        }
    }

    unsafe {
        for &idx in order {
            let s = ptr::read(vec.get_unchecked(idx));
            cb(s);
        }
        vec.set_len(0);
    }
}

Example:

let src = vec![
    "0".to_string(),
    "1".to_string(),
    "2".to_string(),
    "3".to_string(),
    "4".to_string(),
];
let mut dst = vec![];

consume_vec_in_order(
    src,
    &[2, 1, 3, 0, 4],
    |elem| dst.push(elem),
);
assert_eq!(dst, vec!["2", "1", "3", "0", "4"]);
Fuu
  • 165
  • 2
  • 11
  • 2
    This code is bad for multiple reasons: If `cb` panics and unwinds, then you can get double frees. Skipping the checks is unsound since undefined behavior should never be possible for safe code, even in release mode. Transmuting a `Vec` to a `Vec>` is undefined behavior (you can simply use `.set_len(0)` to have the `Vec` forget about those elements) and `drop(vec)` does nothing here since all variables are dropped at the end of their scope. – kmdreko Feb 02 '23 at 04:27
  • @kmdreko Thank you for pointing out these problems. To avoid double freedom, the ideal method is to construct a custom `struct IntoIter` and implement `Iterator` and `Drop` traits for it. However, this requires the use of some functions of `Allocator` and `RawVec`, which is troublesome to use outside the standard library. – Fuu Feb 02 '23 at 08:43
  • So I just changed the answer, adding `unsafe` before `fn consume_vec_in_order`, requiring users to ensure the safety of `order` and `cb` by themselves; and use ".set_len(0)" instead. – Fuu Feb 02 '23 at 08:52