
There's a very nice split_at_mut function which can take one slice and make it into two. Is there a way to undo that operation so I can get back my original array? Let's assume I know that the two slices are contiguous in memory (because I just split them).

The question is: is there something similar to join_mut like so:

fn main() {
    let mut item : [u8;32] = [0u8;32];
    let (mut first, mut second) = item[..].split_at_mut(16);
    first[0] = 4;
    second[0] = 8;
    let mut x = first.join_mut(first, second); // <-- compile error
    assert_eq!(x[16], 8);
}
hellcatv
  • I'm not sure I understand. Since `first` and `second` are just slices of `item`, why not use `item` itself instead of trying to join them back? Replacing `x` with `item`, as in `assert_eq!(item[16], 8);`, would pass the test (assuming you returned the borrowed slices). Or was this just a simplification of what you're actually trying to do? – Joachim Isaksson Apr 03 '16 at 06:21
  • That's a reasonable thing in this simplified example. But let's assume you're making an allocator of some sort: you returned 2 unjoined slices to clients, both got freed, and you'd like to join them into a bigger slice. In that case you're bound to have lost track of where each slice came from, and you no longer have the original item, since it has dozens of other slices borrowed against it through split_at_mut. – hellcatv Apr 03 '16 at 06:37
  • The exact situation in question is here: https://github.com/dropbox/rust-alloc-no-stdlib/blob/master/src/stack_allocator.rs in free_cell, where &'a mut slices are being returned to the system. It would be nice to see if they could be recombined with other freed slices to unify them. – hellcatv Apr 03 '16 at 06:39

2 Answers

There is no such function in the standard library, probably because it is a rather niche case which can usually be resolved by using the slice that was split in the first place.

That being said, with a bit of unsafe it is possible to implement the function.

fn join_mut<'a, T>(first: &'a mut [T], second: &'a mut [T]) -> Option<&'a mut [T]> {
    let fl = first.len();
    if first[fl..].as_mut_ptr() == second.as_mut_ptr() {
        unsafe {
            Some(::std::slice::from_raw_parts_mut(first.as_mut_ptr(), fl + second.len()))
        }
    } else {
        None
    }
}

Mar
  • For the case that these were actually from the same slice this code makes perfect sense to me. However, in C++ comparing equality of pointers from different arrays is undefined behavior (I think to compensate for segmented architectures). Is this defined behavior in Rust? – hellcatv Apr 03 '16 at 19:11
  • I had no idea that it can be undefined behavior to compare pointers in C++ in some cases. Not sure how Rust defines it so I am afraid someone more knowledgeable will need to answer that question. It seems really counter intuitive to me but might be that my code actually contains undefined behaviour in that case. – Mar Apr 03 '16 at 20:25
  • @hellcatv I found this question http://stackoverflow.com/questions/4909766/is-it-unspecified-behavior-to-compare-pointers-to-different-arrays-for-equality which says that the result is unspecified for <, >, etc., but == and != return a defined result, so I believe that at least for C++ there is no undefined behavior when checking the equality of two pointers. Probably Rust does not make this unspecified either. – Mar Apr 03 '16 at 20:44
  • (For posterity) this is unsound; see my answer. – Chayim Friedman Jun 20 '22 at 22:22
I don't know what the situation was back when the question was asked in 2016 (I suspect it was the same, although the UB rules were much less clear then), but I'm posting here for future readers who have the same question.


This function does not exist because it cannot be written soundly.

First, we cannot know if it is valid to join pointers. Being adjacent in memory is not enough; they may be from different objects and just happen to have adjacent addresses:

let mut a = [1, 2, 3];
let mut b = [4, 5, 6];
// `a` and `b` may be contiguous on stack, but joining them is immediate undefined behavior.

This already rules out writing a safe function, because we cannot check all of the preconditions (i.e. that the slices are from the same allocated object).
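To make that concrete, here is a minimal sketch of the only check safe code can perform. The helper names `looks_adjacent` and `adjacent_after_split` are made up for illustration. The check compares addresses and never dereferences anything, so it is fine on its own, but a `true` result says nothing about whether the two slices belong to the same allocated object.

```rust
/// Hypothetical helper: reports whether `second` starts at the address
/// where `first` ends. Only addresses are compared; nothing is dereferenced.
fn looks_adjacent<T>(first: &[T], second: &[T]) -> bool {
    first.as_ptr_range().end == second.as_ptr()
}

/// Two halves produced by `split_at_mut` are adjacent by construction.
fn adjacent_after_split() -> bool {
    let mut item = [0u8; 32];
    let (first, second) = item.split_at_mut(16);
    looks_adjacent(&*first, &*second)
}
```

For two separate local arrays the same helper may return either `true` or `false` depending on how the compiler lays out the stack, which is exactly why it cannot serve as a soundness check.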


Note: The following refers to the current iteration of Stacked Borrows. It is possible that future models will render it incorrect, especially w.r.t shared references. Also, the rules are non-normative, although I think there is a consensus.

But even when writing an unsafe function, you cannot use references and must use raw pointers. In this code:

let mut array = [1, 2, 3, 4, 5, 6];
let array_ref: &mut [i32] = &mut array;
let (part1, part2): (&mut [i32], &mut [i32]) = array_ref.split_at_mut(3);

It is immediate undefined behavior to construct a reference from either part1 or part2 that overlaps with the other part. This is because of provenance. Each pointer has a "hidden" provenance value that exists only on the Abstract Machine and represents the area this pointer has access to. array_ref has provenance over the whole array, so accessing any element of it is fine. However, part1 and part2 each have provenance over only half of the array; accessing any element outside their own half is undefined behavior.

It is impossible to recover provenance with references. Once I have a reference r, all references derived from it will have its provenance or smaller. Growing the provenance is not possible: if you access an element outside of this range, or construct a reference that points out of this range (even if you never access it), you trigger undefined behavior.
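A small sketch of the "shrink only" rule, assuming the model described above (the function name `shrink_only` is hypothetical). Deriving a narrower reference is fine; the widening step is left commented out because it would compile but is undefined behavior under that model.

```rust
fn shrink_only() -> i32 {
    let mut array = [1, 2, 3, 4, 5, 6];
    let whole: &mut [i32] = &mut array;
    let (part1, _part2) = whole.split_at_mut(3);

    // Shrinking is fine: `inner` covers a range that lies inside `part1`.
    let inner: &mut [i32] = &mut part1[1..3];
    inner[0] = 20;

    // Growing is not. The line below compiles, but the resulting reference
    // reaches past `part1` (and overlaps `_part2`), so it must not be written:
    // let grown = unsafe { std::slice::from_raw_parts_mut(part1.as_mut_ptr(), 6) };

    // All borrows have ended here, so the array can be read directly again.
    array[1]
}
```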

We also cannot make the provenance of part1 and part2 big enough to cover the entire array, since then they would overlap, violating the borrow rules as they are both mutable.

So for join_mut() to be valid, it needs to take raw pointers (which can keep the provenance of the whole array and are not subject to the aliasing rules). This necessitates using unsafe to call it and makes it useful only to a small niche of people.
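As a rough illustration of what such a raw-pointer version could look like, here is a sketch under stated assumptions. `RawChunk`, `split_raw`, `join_raw` and `raw_join_demo` are all hypothetical names, not standard library APIs. The idea is that both halves are offsets of one raw pointer to the whole buffer, so nothing ever narrows the provenance, and a reference is only created again after the join.

```rust
use std::slice;

/// Hypothetical representation of a raw chunk: start pointer plus element count.
type RawChunk<T> = (*mut T, usize);

/// Splits a raw chunk in two. Both halves are offsets of the same raw pointer.
///
/// Safety: `chunk` must describe `chunk.1` elements inside one allocation,
/// and `mid` must be at most `chunk.1`.
unsafe fn split_raw<T>(chunk: RawChunk<T>, mid: usize) -> (RawChunk<T>, RawChunk<T>) {
    let (ptr, len) = chunk;
    debug_assert!(mid <= len);
    // In bounds (or one past the end) by the safety contract above.
    let right = unsafe { ptr.add(mid) };
    ((ptr, mid), (right, len - mid))
}

/// Rejoins two chunks if `second` starts where `first` ends.
///
/// Safety: the caller must guarantee that both chunks were derived from the
/// same original raw pointer. The address comparison alone cannot prove that.
unsafe fn join_raw<T>(first: RawChunk<T>, second: RawChunk<T>) -> Option<RawChunk<T>> {
    if first.0.wrapping_add(first.1) == second.0 {
        Some((first.0, first.1 + second.1))
    } else {
        None
    }
}

fn raw_join_demo() -> u8 {
    let mut item = [0u8; 32];
    let len = item.len();
    // One raw pointer over all 32 bytes; every chunk below is derived from it.
    let whole: RawChunk<u8> = (item.as_mut_ptr(), len);
    unsafe {
        let (first, second) = split_raw(whole, 16);
        *first.0 = 4;
        *second.0 = 8;
        let (ptr, joined_len) = join_raw(first, second).expect("chunks came from one split");
        // Only now is a reference created, covering the whole rejoined range.
        let joined: &mut [u8] = slice::from_raw_parts_mut(ptr, joined_len);
        joined[16]
    }
}
```

The burden of proving "same allocation" stays with the caller, which is why both functions are `unsafe fn` even though `join_raw` itself only compares addresses.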

See also Safe slice rejoining - internals.rust-lang.org.

Chayim Friedman