
I wrote some Rust code that provides an FFI for some C code, and I recently discovered a bug in it. Turns out unsafe is hard and error prone. Who knew! I think I've fixed the bug, but I'm curious to understand the issue better.

One function took a Vec, called into_boxed_slice on it and returned the pointer (via as_mut_ptr) and length to the caller. It called std::mem::forget on the Box before returning.

The corresponding "free" function only accepted the pointer and called Box::from_raw with it. Now this is wrong, but the amazing thing about undefined behaviour is that it can work most of the time. And this did, except when the source Vec was empty, in which case it would segfault. Also of note, MIRI correctly identifies the issue: "Undefined Behavior: inbounds test failed: 0x4 is not a valid pointer".

Anyway the fix was to take the length in the free function as well, reconstitute the slice, then Box::from_raw that. E.g. Box::from_raw(slice::from_raw_parts_mut(p, len))
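To make the fix concrete, here is a minimal sketch of such a pair of functions. The names into_raw_parts and free_raw_parts are hypothetical, and the extern "C" wrappers are left out. It assumes Box::into_raw as a substitute for the as_mut_ptr plus forget dance, and std::ptr::slice_from_raw_parts_mut in place of slice::from_raw_parts_mut so that no intermediate reference is created; neither is what the original code used.

```rust
use std::ptr;

/// Hypothetical helper: gives up ownership of the buffer and returns
/// (pointer, length) for the caller to hold on to.
fn into_raw_parts(v: Vec<i32>) -> (*mut i32, usize) {
    let boxed: Box<[i32]> = v.into_boxed_slice();
    let len = boxed.len();
    // Box::into_raw leaks the box, so no mem::forget is needed.
    // The cast drops the length half of the fat pointer.
    let ptr = Box::into_raw(boxed) as *mut i32;
    (ptr, len)
}

/// Hypothetical counterpart that frees the buffer again.
///
/// Safety: `ptr` and `len` must come from a single call to
/// `into_raw_parts`, and the pair must be freed at most once.
unsafe fn free_raw_parts(ptr: *mut i32, len: usize) {
    // Rebuild the fat pointer so the Box has type Box<[i32]> again.
    let fat: *mut [i32] = ptr::slice_from_raw_parts_mut(ptr, len);
    drop(unsafe { Box::from_raw(fat) });
}

fn main() {
    for v in vec![vec![], vec![1], vec![1, 2, 3]] {
        let (ptr, len) = into_raw_parts(v);
        println!("ptr = {:p}, len = {}", ptr, len);
        unsafe { free_raw_parts(ptr, len) };
    }
}
```

The point of the design is that the length travels with the pointer in both directions, so the free side can rebuild exactly the type that was leaked.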

I've tried to capture all of this in this playground: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=7fe80cb9f0c5c1eee4ac821e58787f17

Here's the playground code for reference:

use std::slice;

fn main() {
    // This one does not crash
    demo(vec![1]);
    
    // These do not crash
    hopefully_correct(vec![2]);
    hopefully_correct(vec![]);
    
    // This one seg faults
    demo(vec![]);
}

// MIRI complains about UB in this one (in Box::from_raw)
fn demo(v: Vec<i32>) {
    let mut s: Box<[i32]> = dbg!(v.into_boxed_slice());
    let p: *mut i32 = dbg!(s.as_mut_ptr());
    assert!(!p.is_null());
    std::mem::forget(s);
    // Pretend the pointer is returned to an FFI interface here
    
    // Imagine this is the free function counterpart to the imaginary FFI.
    unsafe { Box::from_raw(p) };
}

// MIRI does not complain about this one
fn hopefully_correct(v: Vec<i32>) {
    let mut s: Box<[i32]> = dbg!(v.into_boxed_slice());
    let p: *mut i32 = dbg!(s.as_mut_ptr());
    let len = s.len();
    assert!(!p.is_null());
    std::mem::forget(s);
    // Pretend the pointer is returned to an FFI interface here
    
    // Imagine this is the free function counterpart to the imaginary FFI.
    unsafe { Box::from_raw(slice::from_raw_parts_mut(p, len)) };
}

I've looked through the Box source and done a bunch of searching but it's unclear to me how rebuilding the slice helps. It would seem that the pointers are the same but there is some empty optimisation handled properly in the fixed example somewhere, possibly as part of Unique?

Can anyone explain what's going on here?

I found these three links useful but not enough to answer my query:

Wes

1 Answer


That's because when you deconstruct your empty vector, you get a dangling pointer and a zero length. The pointer is non-null and aligned (it is the 0x4 in the MIRI message, which is why the assert passes), but it does not point to any allocation.

When you call Box::from_raw(p) on it, you build a Box<i32> and break one of the box invariants: the pointer must come from an allocation made with the layout of the boxed type. Then when Rust drops the box, it attempts to deallocate 4 bytes at an address that was never allocated. The one-element case only appears to work because a single i32 and a one-element [i32] happen to have the same layout.

OTOH when you call slice::from_raw_parts_mut, Rust builds a fat pointer that contains the dangling pointer and the zero length, then Box::from_raw stores this fat pointer directly in the Box. When dropping the box, Rust first drops the elements (there are none), then sees that the slice occupies zero bytes and skips the deallocation.
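As a rough illustration of the empty case, the sketch below inspects the data pointer and byte size of a boxed slice. The helper name inspect is hypothetical. It only asserts that the empty pointer is non-null and aligned, because the exact address (0x4 in the MIRI message) is an implementation detail.

```rust
use std::mem;

/// Hypothetical helper: returns (data address, size in bytes) of the
/// boxed slice built from `v`.
fn inspect(v: Vec<i32>) -> (usize, usize) {
    let s: Box<[i32]> = v.into_boxed_slice();
    (s.as_ptr() as usize, mem::size_of_val(&*s))
}

fn main() {
    // Empty slice: a non-null, aligned address with nothing behind it.
    let (addr, bytes) = inspect(Vec::new());
    assert_ne!(addr, 0);
    assert_eq!(addr % mem::align_of::<i32>(), 0);
    assert_eq!(bytes, 0);
    println!("empty:     address {:#x}, {} bytes", addr, bytes);

    // One element: a real heap allocation of size_of::<i32>() bytes.
    let (addr, bytes) = inspect(vec![1]);
    assert_eq!(bytes, mem::size_of::<i32>());
    println!("non-empty: address {:#x}, {} bytes", addr, bytes);
}
```

A zero-byte Box<[i32]> has nothing to hand back to the allocator, whereas a Box<i32> always claims to own four bytes.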

Note also that in the non-working case you reconstruct a Box<i32>, whereas in the working case you reconstruct a Box<[i32]>, as shown by the compiler errors you get if you try to compile the following code:

use std::slice;

fn demo(v: Vec<i32>) {
    let mut s: Box<[i32]> = dbg!(v.into_boxed_slice());
    let p: *mut i32 = dbg!(s.as_mut_ptr());
    assert!(!p.is_null());
    std::mem::forget(s);
    // Pretend the pointer is returned to an FFI interface here
    
    // Imagine this is the free function counterpart to the imaginary FFI.
    // Deliberate type error: the compiler message reveals the real type, Box<i32>.
    let _b: () = unsafe { Box::from_raw(p) };
}

// MIRI does not complain about this one
fn hopefully_correct(v: Vec<i32>) {
    let mut s: Box<[i32]> = dbg!(v.into_boxed_slice());
    let p: *mut i32 = dbg!(s.as_mut_ptr());
    let len = s.len();
    assert!(!p.is_null());
    std::mem::forget(s);
    // Pretend the pointer is returned to an FFI interface here
    
    // Imagine this is the free function counterpart to the imaginary FFI.
    // Deliberate type error: the compiler message reveals the real type, Box<[i32]>.
    let _b: () = unsafe { Box::from_raw(slice::from_raw_parts_mut(p, len)) };
}


Jmb
  • "Rust allocates a new fat pointer", does this require a heap allocation? – Wes Jul 04 '20 at 23:40
  • No, [`Box` is a lang-item](https://doc.rust-lang.org/src/alloc/boxed.rs.html#156) so it is special-cased in the compiler. You can see that because [a slice box is twice the size of a regular box](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=3d3bdb2f8ca20972c96f1435c72b559e), meaning that the fat pointer is stored directly inside the box with no extra indirection. – Jmb Jul 06 '20 at 06:24
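
For reference, the size comparison mentioned in the last comment can be reproduced with a few lines like these. This is a sketch, not the code behind the linked playground, and the helper name box_sizes is hypothetical.

```rust
use std::mem::size_of;

/// Hypothetical helper: sizes in bytes of a thin box and a slice box.
fn box_sizes() -> (usize, usize) {
    (size_of::<Box<i32>>(), size_of::<Box<[i32]>>())
}

fn main() {
    let (thin, fat) = box_sizes();
    println!("Box<i32>:   {} bytes", thin);
    println!("Box<[i32]>: {} bytes", fat);
    // A slice box holds the data pointer plus the length inline.
    assert_eq!(thin, size_of::<usize>());
    assert_eq!(fat, 2 * size_of::<usize>());
}
```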