
I don't have a particularly solid understanding of Rust's aliasing rules (and from what I've heard they're not solidly defined), but I'm having trouble understanding what makes this code example in the std::slice documentation okay. I'll repeat it here:

let x = &mut [1, 2, 4];
let x_ptr = x.as_mut_ptr();

unsafe {
    for i in 0..x.len() {
        *x_ptr.offset(i as isize) += 2;
    }
}
assert_eq!(x, &[3, 4, 6]);

The problem I see here is that x, being an &mut reference, can be assumed to be unique by the compiler. The contents of x get modified through x_ptr, and then read back via x, and I see no reason why the compiler couldn't just assume that x hadn't been modified, since it was never modified through the only existing &mut reference.

So, what am I missing here?

  • Is the compiler required to assume that *mut T may alias &mut T, even though it's normally allowed to assume that &mut T never aliases another &mut T?

  • Does the unsafe block act as some sort of aliasing barrier, where the compiler assumes that code inside it may have modified anything in scope?

  • Is this code example broken?

If there is some kind of stable rule that makes this example okay, what exactly is it? What is its extent? How much should I worry about aliasing assumptions breaking random things in unsafe Rust code?

Shepmaster
lcmylin
  • I think that it's LLVM who handle this and as `x` and `x_ptr` contain the address of the same type, LLVM must reload `x` – Stargateur Sep 28 '18 at 04:52
  • @Stargateur Really? I was under the impression that type-based alias analysis allowed LLVM to make stronger assumptions about the disjointness of objects of the same type in memory. – lcmylin Sep 28 '18 at 05:13
  • @Mylin: From memory, TBAA is opt-in (the front-end needs to emit specific attributes) and rustc doesn't opt-in. Instead it uses per-variable annotations. – Matthieu M. Sep 28 '18 at 09:01
  • Indeed, Rust does NOT do any reasoning based on pointee type (except for checking for interior mutability). So what @Stargateur wrote is incorrect for Rust. – Ralf Jung Sep 30 '18 at 13:33

1 Answer


Disclaimer: there is no formal memory model yet.¹

First of all, I'd like to address:

The problem I see here is that x, being an &mut reference, can be assumed to be unique by the compiler.

Yes... and no. x can only be assumed to be unique while it is not re-borrowed, which is an important distinction:

fn doit<T>(x: &mut T) {
    let y = &mut *x;
    // x is re-borrowed at this point and cannot be used while y is live.
}

Therefore, currently, I would work with the assumption that deriving a pointer from x will temporarily "borrow" x in some sense.
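To make the re-borrow idea concrete, here is a minimal sketch in safe Rust. The function name `bump_via_reborrow` is hypothetical and only serves the illustration. While `y` is live, the borrow checker rejects any use of `x`; once `y` is no longer used, `x` is usable again.

```rust
// Hypothetical helper for illustration only; not part of the original example.
fn bump_via_reborrow(x: &mut i32) -> i32 {
    let y = &mut *x; // `x` is re-borrowed; only `y` may be used while it is live
    *y += 1;
    // `y` is not used past this point, so the re-borrow has ended
    // and `x` can be used directly again.
    *x += 10;
    *x
}

fn main() {
    let mut value = 1;
    assert_eq!(bump_via_reborrow(&mut value), 12);
    assert_eq!(value, 12);
}
```

With raw pointers the compiler no longer enforces this hand-over, so the same discipline has to be followed manually.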

This is all wishy-washy in the absence of a formal model, of course, and it is part of the reason why the rustc compiler is not too aggressive with aliasing optimizations yet: until a formal model is defined, and code is checked to match it, optimizations have to be conservative.

¹ The RustBelt project is all about establishing a formally proven memory model for Rust. The latest news from Ralf Jung was about a Stacked Borrows model.


From Ralf (comments): the key point in the above example is that there is a clear transfer from x to x_ptr and back to x again. So the x_ptr is a scoped borrow in a sense. Should the usage go x, x_ptr, back to x and back to x_ptr, then the latter would be Undefined Behavior:

fn main() {
    let x = &mut [1, 2, 4];
    let x_ptr = x.as_mut_ptr(); // x_ptr borrows the right to mutate

    unsafe {
        for i in 0..x.len() {
            *x_ptr.offset(i as isize) += 2; // Fine use of raw pointer.
        }
    }
    assert_eq!(x, &[3, 4, 6]);  // x is back in charge, x_ptr invalidated.

    unsafe { *x_ptr += 1; }     // BÄM! Used no-longer-valid raw pointer.
}
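
One way to stay within the pattern described above is to derive a fresh raw pointer every time, and to stop using it before `x` is touched again. The following is only a sketch under that assumption; the helper names `add_two_to_all` and `bump_first` are hypothetical, and `add` is used in place of `offset(i as isize)` purely for brevity.

```rust
/// Hypothetical helper: adds 2 to every element through a raw pointer.
fn add_two_to_all(x: &mut [i32]) {
    let len = x.len();
    let x_ptr = x.as_mut_ptr(); // derived from `x`
    unsafe {
        for i in 0..len {
            // `x` is not touched while `x_ptr` is in use.
            *x_ptr.add(i) += 2;
        }
    }
} // `x_ptr` is never used again after this point.

/// Hypothetical helper: increments the first element, if any,
/// through a freshly derived raw pointer.
fn bump_first(x: &mut [i32]) {
    if x.is_empty() {
        return;
    }
    let x_ptr = x.as_mut_ptr(); // re-derived from `x`
    unsafe {
        *x_ptr += 1;
    }
}

fn main() {
    let x = &mut [1, 2, 4];
    add_two_to_all(x);
    assert_eq!(x, &[3, 4, 6]); // `x` is back in charge
    bump_first(x); // a new pointer is derived after that use of `x`
    assert_eq!(x, &[4, 4, 6]);
}
```

Keeping each raw pointer local to a small function also makes it hard to reuse a stale pointer by accident.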
Matthieu M.
  • Indeed, the key point is that `x_ptr` is *derived from* `x` AND `x` has not been used since `x_ptr` was created. Both of these have to be true for this code to be correct. – Ralf Jung Sep 30 '18 at 13:34
  • Might be worth adding an example like https://play.rust-lang.org/?gist=c830d6933236d2a23f833de664209c40&version=stable&mode=debug&edition=2015 showing that using `x_ptr` again after `x` was used is *not* allowed. – Ralf Jung Sep 30 '18 at 14:05
  • @RalfJung It's """not allowed""", yet `assert_eq!(x, &[3, 4, 6]);` right after the last line fails and tells that it has changed to `4, 4, 6`... So are we back to having issues which Rust was built to avoid, by simply not even defining what's correct and what's not? If compiler doesn't optimize (I built it on release mode, same thing) it right now to not break it (it seems to, which is why I get correct results as if I was using C), then what's even the point of these arbitrary rules? This is major pain point for me, when it's impossible to figure out what's wrong and what's okay to do... –  Aug 04 '19 at 18:43
  • To be completely clear: `&mut + &mut = compile error`, this is the only thing that's obvious... I got there by trying to figure out whether `&mut + *mut = wrong` (here we claim that it's wrong), and whether `*mut + *mut = wrong` (I have yet to find anything mentioning it). If there's no clear rules set yet, then is this UB, but not UB(tm)? –  Aug 04 '19 at 18:48
  • It's UB, and exploited in some cases. Some optimizations are temporarily [disabled because of LLVM bugs](https://stackoverflow.com/questions/57259126/why-does-the-rust-compiler-not-optimize-code-assuming-that-two-mutable-reference). But just because the compiler does not currently recognize your code in the maximal possible way, doesn't mean it won't get better in the future. You can't expect the compiler to do all the most aggressive optimizations from the start. In C the usual approach seems to be "optimize until someone complains it's wrong"; we'd like to first be sure we know what's right. – Ralf Jung Aug 05 '19 at 20:40
  • @Sahsahae also, *unsafe* Rust does indeed share many of the problems of Undefined Behavior with C and C++. The value in Rust lies in the ability to seal unsafety behind an abstraction, and localize it. Compare `std::vector` in C++ and `Vec` in Rust: their *implementation* is very similar, and it is equally dangerous in both languages. But as a *user* there is a huge difference: in C++ you have to worry about iterator invalidation etc. all the time, in Rust you know the compiler got your back. – Ralf Jung Aug 05 '19 at 20:42