11

I have two slices that are passed from another method:

fn example<T>(a1: &[T], a2: &mut [T]) {}

I want to process a1 with multiple threads and then write to a2 using completely arbitrary indices that are only known when each thread is executed. The indices are guaranteed by my algorithm to be mutually exclusive, so there is no data race.

The borrow checker doesn't like sharing mutable references among threads since it doesn't know of the guarantee our algorithm makes. I also get a lifetime 'static required rustc (E0621) error.

So how to do this in Rust?

Answers to

Do not answer my question.

The answer to the first question addresses the scoping problem, but not the problem of accessing arbitrary mutually disjoint indexes. The answer to the second question suggests as_slice_of_cells but that doesn't work here because of the aforementioned reason, namely the arbitrary access. The answer to the third question similarly suggests as_slice_of_cells but again, the assumption that the array can be split into disjoint parts cannot be fulfilled here. The fourth question again asks about partitioning the array, which we cannot do here. And the same applies to the fifth question.

One answer to the scoping problem (https://stackoverflow.com/a/64502824/10056727) actually attempts to address this problem, but it doesn't suggest to use crossbeam and the alternative suggested is more unsafe than the top answer here.

Shepmaster
  • 388,571
  • 95
  • 1,107
  • 1,366
Niki
  • 738
  • 8
  • 17
  • 1
    This seems like a request for code disguised as a question. Stack Overflow however is not a code-writing service, please [see here](http://stackoverflow.com/help/how-to-ask) to learn how to write effective questions and then come back and [edit] this question. [This](https://idownvotedbecau.se/noattempt/) link also explains why this is worth a downvote. – MindSwipe Dec 07 '20 at 09:33
  • I don't think you understand my question. I've broken down my problem as much as possible and after searching for multiple hours, I haven't found a satisfactory solution. I formulated the problem in this way to make it as clear as possible. – Niki Dec 07 '20 at 09:37
  • 1
    The problem is with how you described your problem, it's just that it seems like you made no attempt to produce the code yourself. Stack Overflow is about specific issues with specific pieces of code, and your question doesn't contain any. This leads to a sense of you demanding us to write the code for you, which isn't what this site is for and will garner downvotes and close votes – MindSwipe Dec 07 '20 at 09:41
  • I did write code, but I thought the question was more concise if stated like this, because the problem itself leaves little for interpretation and is already stated in a very general way. I can add some code, if this will help my chances of getting answers. I may have gotten an answer elsewhere though, and if that is the case I will add it here. I'll get back to this in a few minutes. – Niki Dec 07 '20 at 09:47
  • 1
    I think you want [scoped threads](https://docs.rs/crossbeam/0.8.0/crossbeam/fn.scope.html), likely a dup of [this question](https://stackoverflow.com/questions/32750829/how-can-i-pass-a-reference-to-a-stack-variable-to-a-thread). Since one slice is mutable and the other immutable, they cannot refer to the same memory, so you don't need to guarantee that the indexes don't overlap. Rust should have no problems handling this. – user4815162342 Dec 07 '20 at 09:55
  • 1
    Scoped threads turned out to be part of the answer. I ended up wrapping a pointer to my data in UnsafeCell and implementing Send and Sync for that, as suggested somewhere else. Additionally, I had to use crossbeam's scoped threads. – Niki Dec 07 '20 at 12:57
  • Wrapping it up, I do think I should've asked my question more carefully. Like I said, I got help somewhere else. I'll ask the person for permission to share the code here, then I can edit this question too. I stumbled on a few other people who've asked similar questions, but without any satisfactory answer, so I think this could help others who stumble on this post. – Niki Dec 07 '20 at 13:13
  • I edited my question, I hope that is sufficient. The problem is mainly that all of the other questions don't want truly arbitrary access, so they can just split the array. This doesn't work here. I also updated the question title to make this clearer. – Niki Dec 07 '20 at 20:31
  • https://stackoverflow.com/a/62564887/155423 actually _does_ allow for arbitrary access to the slice, but `Cell` isn't sendable or shareable (I forget) across threads, which is why you have to dip into `unsafe`. – Shepmaster Dec 07 '20 at 20:33
  • *the problem itself leaves little for interpretation* — famous last words. – Shepmaster Dec 07 '20 at 20:36
  • Yes, I should've made it more clear. When I asked the question, I didn't really know myself what I really wanted. I'll make sure to be more clear about why the problem is unique and not answered by other questions next time. – Niki Dec 07 '20 at 20:43
  • Maybe this approach is useful for you: https://stackoverflow.com/a/70851207/286335 – cibercitizen1 Jan 25 '22 at 15:22

4 Answers4

14

You are running into two distinct issues when trying to implement your algorithm:

  1. Sharing non-'static references across threads is not possible with std::thread::spawn.
  2. Writing to mutually disjoint indexes in a slice without synchronization can only be done safely if you can do so by splitting the slice into multiple smaller slices and giving each split slice exclusively to each of the threads.

The first problem is easily avoided by using crossbeam::scope to spawn the threads rather than std::thread::spawn. However the latter issue requires an unsafe solution. However since you know that the indexes are mutually disjoint, there is no data-race in practice, and you can use UnsafeCell to assert to the compiler that there are no data-races. To do this for a slice, you can use the following utility:

use std::cell::UnsafeCell;

#[derive(Copy, Clone)]
pub struct UnsafeSlice<'a, T> {
    slice: &'a [UnsafeCell<T>],
}
unsafe impl<'a, T: Send + Sync> Send for UnsafeSlice<'a, T> {}
unsafe impl<'a, T: Send + Sync> Sync for UnsafeSlice<'a, T> {}

impl<'a, T> UnsafeSlice<'a, T> {
    pub fn new(slice: &'a mut [T]) -> Self {
        let ptr = slice as *mut [T] as *const [UnsafeCell<T>];
        Self {
            slice: unsafe { &*ptr },
        }
    }
    
    /// SAFETY: It is UB if two threads write to the same index without
    /// synchronization.
    pub unsafe fn write(&self, i: usize, value: T) {
        let ptr = self.slice[i].get();
        *ptr = value;
    }
}

This utility allows you to convert an exclusive slice &mut [T] into a slice that can be shared, but still used for mutation. Of course, this means that writing to it can result in a data race, if multiple threads write to the same index without synchronization. Thus the write method is unsafe and will cause UB if this assumption is violated

The UnsafeSlice utility will still guarantee that you have no use-after-free or out-of-bounds errors when using it. Only verification of race conditions is turned off with UnsafeSlice.

To see that the conversion in the constructor is sound, please see the safety comments inside the Cell::from_mut and Cell::as_slice_of_cells methods.

Alice Ryhl
  • 3,574
  • 1
  • 18
  • 37
1

How to do this in Rust? Resorting to unsafe is okay.

If you need two threads to mutate the same data, without locks, then you must use unsafe. It is Undefined Behaviour for two mutable references (&mut) to the same data to even exist, so you would need to access the data via *mut raw pointers, while being extremely careful to avoid data races.

However, depending on your algorithm, you may be able to avoid unsafe by creating non-overlapping &mut references with split_at_mut.

Peter Hall
  • 53,120
  • 14
  • 139
  • 204
  • I guess it will be useful to mention about [`Cell`](https://doc.rust-lang.org/std/cell/struct.Cell.html) here. If the data is wrapped in it, a mutable pointer can be obtained and because of `Cell`, compiler will not make optimisations based on aliasing. – Mihir Luthra Dec 07 '20 at 11:51
  • 1
    @Mihir `Cell` is not `Sync`, so cannot be shared between threads. – Peter Hall Dec 07 '20 at 11:55
  • I didn't exactly mean that. As the OP says that mutual exclusion is managed manually, I think it would be a good idea to wrap a cell type in a struct and implement `Send` and `Sync` for it. `Cell` is `Send`, so there should be no issues sending it to another thread. – Mihir Luthra Dec 07 '20 at 12:03
  • Unfortunately, split_at_mut wasn't enough, I needed truly arbitrary access to the mutable slice. I ended up wrapping a pointer to my data in UnsafeCell and implementing Send and Sync for that, as suggested somewhere else. Additionally, I had to use crossbeam's scoped threads as suggested in one of the comments to my question. Thank you for taking your time! – Niki Dec 07 '20 at 12:56
  • 1
    @Niki, just in case if it helps, you are able to use `UnsafeCell` because an exception is made by the compiler because `UnsafeCell` is attributed with [unsafe_cell lang item](https://doc.rust-lang.org/unstable-book/language-features/lang-items.html) in its definition. So it doesn't cause undefined behaviour as stated in the [reference](https://doc.rust-lang.org/reference/behavior-considered-undefined.html). – Mihir Luthra Dec 07 '20 at 13:02
0

Here is a code using a Vector or AtomicI32 put into an Arc for doing what you need:

use std::sync::Arc;
use std::thread;

use std::sync::atomic::AtomicI32;
use std::sync::atomic::Ordering;

// ==================================================================
// ==================================================================
const SIZE: usize = 4;

// ==================================================================
// ==================================================================
fn new_worker( id: i32, // id for the thread
               a_data: Arc::<Vec<AtomicI32>> // Vec of i32 inside an arc-ptr
             ) -> thread::JoinHandle<()> {

    // create new thread
    // a_data is moved into the thread
    let hand = thread::spawn(move || {

        // a_data[id] <- 2 * id * a_data[id]
        // use load() for readng and store() for writing
        a_data[id as usize].store( 
            2 * id *
            a_data[id as usize].load( Ordering::Relaxed ),
            Ordering::Relaxed );
    });

    return hand;
} // ()

// ==================================================================
// ==================================================================
fn main() {
    let mut data : Vec<AtomicI32> = Vec::new(); //  vector of i32

    // populate the vector
    for _i in 0..SIZE {
        data.push( AtomicI32::new( 1 ) );
    } // for

    // put the array into an arc-ptr: move to the heap, share arc-ptr
    let data_arc = Arc::new(data);

    // share the data through the arc-ptr
    let hand1 = new_worker(1, data_arc.clone());
    let hand2 = new_worker(3, data_arc.clone());

    // wait for the threads to end
    hand1.join().unwrap();
    hand2.join().unwrap();

    // print their work
    println!( "2 == {:?}", data_arc[1].load( Ordering::Relaxed ) );
    println!( "6 == {:?}", data_arc[3].load( Ordering::Relaxed ) );
} // ()
cibercitizen1
  • 20,944
  • 16
  • 72
  • 95
0

Updated answer, previous produced UB (as Chayim Friedman noted). People say that compiler doesn't make optimization assumptions based on aliasing when dealing with raw pointers. If so then following approach shouldn't cause UB, however I'm not 100% sure:

unsafe fn example<T>(a1: *const T, a1_len: usize, a2: *const T, a2_len: usize) {
    let a2 = std::slice::from_raw_parts_mut(a2 as *mut T, a2_len);
    let a1 = std::slice::from_raw_parts(a1, a1_len);
    ...
}
defo900
  • 1
  • 2
  • 1
    Simple and immediate Undefined Behavior. Do not use this. – Chayim Friedman Jul 03 '22 at 01:50
  • UB isn't measured in terms of compiler optimizations. This function as written is completely unsound since it derefs raw pointers (must be `unsafe`), and also, I'm not sure how you intend to use it, but it very much can be UB (although as opposed to the previous version it can also be used correctly). – Chayim Friedman Jul 06 '22 at 08:48
  • Yes, unsafe should be added (update done). – defo900 Jul 06 '22 at 10:56