
I have C++ code that calls Rust code with data. It knows which object to send the data to. Here's an example of the Rust function that C++ calls back:

extern "C" fn on_open_vpn_receive(
    instance: Box<OpenVpn>,
    data: *mut c_uchar,
    size: *mut size_t,
) -> u8 

It receives the pointer as a Box, so I created a function openvpn_set_rust_parent that sets which object the C++ must call back. This object is a pointer to itself. I'm using Pin so the Box is not reallocated to somewhere else, making C++ call an invalid address.

impl OpenVpn {
    pub fn new() -> Pin<Box<OpenVpn>> {
        let instance = unsafe { interface::openvpn_new(profile.as_ptr()) };
        let o = OpenVpn { instance: instance };
        let p = Box::pin(o);
        unsafe {
            interface::openvpn_set_rust_parent(o.instance, p.as_ptr());
        };
        p
    }
}

Signature:

pub fn openvpn_set_rust_parent(instance: *mut OpenVpnInstance, parent: *mut OpenVpn)

I don't know how to transform p into *mut OpenVpn to pass to C++. Is my idea ok? I think the usage of Pin is good here, and I think this is a good way of calling the object from C++.

Shepmaster
Guerlando OCs
  • "**Important.** At least at present, you should avoid using Box types for functions that are defined in C but invoked from Rust." - from [the `std::boxed` docs](https://doc.rust-lang.org/std/boxed/index.html). So don't write this signature in the first place. – Sebastian Redl Nov 11 '20 at 08:50
  • @SebastianRedl While that is true (and interesting, thanks for the link), to be extremely pedantic, this question appears to ask about a function defined in *Rust* that is invoked from C(++). It seems like the same logic may apply going the other way, though. – trent Nov 11 '20 at 14:44
  • @trentcl Ah, true. I misunderstood that part of the post. And no, the logic *specifically* does not go the other way. – Sebastian Redl Nov 12 '20 at 10:18
  • *It receives the pointer as a `Box`* — this is **highly** suspect, as that means that the function must be called **once and exactly once**. If it's called zero times, you have a memory leak. If it's called twice you will be using memory after it has been freed. Considering that you return `p` from the function, that means that as soon as the callback is triggered, any Rust code that accesses `p` will cause undefined behavior. Ditto if `p` is dropped by the Rust code before the callback occurs. – Shepmaster Nov 14 '20 at 01:52
  • @Shepmaster The C++ object is created and destroyed by `OpenVpn`, so it is effectively owned by `OpenVpn`. `openvpn_set_rust_parent` just sets a callback inside the C++ class so it knows which object (its parent) to call. Calling it twice or more just sets the callback again. There's also no way for it to call nothing, because it calls its parent, which always outlives it. Given these facts, I think it makes sense to call `openvpn_set_rust_parent` and still return `p`. What do you think? – Guerlando OCs Nov 14 '20 at 13:43

1 Answer


It doesn't matter. Pin isn't a deeply magical type that forces your value to never move. Really, it boils down to strongly-worded documentation and some guide rails that prevent you from doing bad things within safe Rust code. Pin can be circumvented by unsafe code, which includes any FFI code.

Having the Pin inside your Rust code might help you keep the Rust code accurate and valid, but it has nothing useful to add for the purposes of calling Rust from other languages.
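As a rough sketch of what "circumvented by unsafe code" can look like in practice, the snippet below obtains a `*mut` pointer to the value behind a `Pin<Box<_>>`. The `OpenVpn` struct here is a hypothetical stand-in with a made-up `id` field, and `parent_ptr` is an illustrative helper name, not something from the question's code base.

```rust
use std::pin::Pin;

// Hypothetical stand-in for the question's `OpenVpn` struct.
pub struct OpenVpn {
    pub id: u32,
}

/// Illustrative helper: returns the address of the pinned heap value.
/// Nothing is moved; only the address is copied out.
pub fn parent_ptr(p: &mut Pin<Box<OpenVpn>>) -> *mut OpenVpn {
    // SAFETY: the `&mut OpenVpn` is only used to take its address;
    // the value is never moved out of it.
    unsafe { p.as_mut().get_unchecked_mut() as *mut OpenVpn }
}

fn main() {
    let mut p = Box::pin(OpenVpn { id: 7 });
    let raw = parent_ptr(&mut p);

    // The raw pointer refers to the same heap value.
    assert_eq!(unsafe { (*raw).id }, 7);

    // Moving the `Pin<Box<_>>` handle does not change the heap address.
    let moved = p;
    assert_eq!(raw as *const OpenVpn, &*moved as *const OpenVpn);
}
```

Under these assumptions, a pointer like `raw` is the kind of value that could be handed to a function with a `*mut OpenVpn` parameter. Keeping it valid for as long as the other side holds it remains the caller's responsibility.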

Pin is defined as repr(transparent), which means that you can use it in your FFI signature as long as the inner type is safe to use in FFI:

#[stable(feature = "pin", since = "1.33.0")]
#[lang = "pin"]
#[fundamental]
#[repr(transparent)]
#[derive(Copy, Clone)]
pub struct Pin<P> {
    pointer: P,
}
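One way to see the effect of `repr(transparent)` is to compare sizes. This small check assumes a hypothetical `Payload` type and an illustrative function name; it only demonstrates that wrapping a `Box` in `Pin` adds nothing to the representation.

```rust
use std::mem::size_of;
use std::pin::Pin;

// Hypothetical payload type, used only for this illustration.
pub struct Payload {
    pub value: u64,
}

/// Returns true when `Pin<Box<Payload>>`, `Box<Payload>` and a raw
/// pointer to `Payload` all occupy the same number of bytes.
pub fn pin_box_is_pointer_sized() -> bool {
    size_of::<Pin<Box<Payload>>>() == size_of::<Box<Payload>>()
        && size_of::<Box<Payload>>() == size_of::<*mut Payload>()
}

fn main() {
    assert!(pin_box_is_pointer_sized());
}
```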

> I'm using Pin so the Box is not reallocated to somewhere else, making C++ call an invalid address.

Pin doesn't do this; Box does. When you box something, you move the value to the heap. The Box itself is just a pointer. The address of that pointer changes whenever the Box is moved, but the address of the data on the heap does not.

Note that the second address (0x55..30, on the heap) printed is the same, even though the Box itself has moved:

fn main() {
    let a = 42;

    let b = Box::new(a);
    println!("{:p}", &b);  // 0x7ffe3598ea80
    println!("{:p}", &*b); // 0x556d21528b30

    let c = b;
    println!("{:p}", &c);  // 0x7ffe3598ea88
    println!("{:p}", &*c); // 0x556d21528b30
}
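Because the heap address stays fixed, a callback could receive a raw pointer and merely borrow from it, instead of taking a `Box` that would free the object when the callback returns. The following is only a sketch under several assumptions: the `OpenVpn` struct and its `bytes_seen` field are hypothetical, `size` is simplified to a plain `usize` rather than the question's `*mut size_t`, and the return codes `0` and `1` are invented for illustration.

```rust
use std::os::raw::c_uchar;

// Hypothetical stand-in for the question's `OpenVpn` struct.
pub struct OpenVpn {
    pub bytes_seen: usize,
}

/// Callback sketch: borrows the object through a raw pointer and does
/// not take ownership, so it can be invoked any number of times.
/// Assumed return codes: 0 = ok, 1 = null argument.
pub extern "C" fn on_open_vpn_receive(
    instance: *mut OpenVpn,
    data: *const c_uchar,
    size: usize,
) -> u8 {
    if instance.is_null() || data.is_null() {
        return 1;
    }
    // SAFETY: assumes the caller passes a pointer to a live `OpenVpn`
    // and that `data` points to `size` readable bytes.
    let vpn = unsafe { &mut *instance };
    let bytes = unsafe { std::slice::from_raw_parts(data, size) };
    vpn.bytes_seen += bytes.len();
    0
}

fn main() {
    let mut vpn = Box::new(OpenVpn { bytes_seen: 0 });
    let raw: *mut OpenVpn = &mut *vpn;
    let payload = [1u8, 2, 3];

    // Two invocations are fine because nothing is freed in between.
    assert_eq!(on_open_vpn_receive(raw, payload.as_ptr(), payload.len()), 0);
    assert_eq!(on_open_vpn_receive(raw, payload.as_ptr(), payload.len()), 0);
    assert_eq!(vpn.bytes_seen, 6);
}
```

With this shape the Rust side keeps ownership of the `Box`, and the other side only holds an address that must not outlive it.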


Shepmaster
  • It's always been a little unclear to me exactly what aspects of a `#[stable]` item have been stabilised; I presume from this that the `repr` is guaranteed to be stable? – eggyal Nov 13 '20 at 22:10
  • So since `Box` is safe to use in FFI, does that mean `Pin<Box<T>>` is too? Should I just call `unsafe { interface::openvpn_set_rust_parent(o.instance, p); }; p`? PS: I cannot do that because I'm moving `p` into `openvpn_set_rust_parent`. – Guerlando OCs Nov 13 '20 at 23:01
  • @eggyal I don't know if stability and things like `repr` have been explicitly written down anywhere, but like many things, it's generally a one-way street. I forget the exact type now, but there was a stdlib type that had no `repr` attribute (and thus was `repr(Rust)`) that did change to be `repr(transparent)`. That was fine because it went from "undefined layout" to "defined layout", but that couldn't change _back_ without breaking things. I would be comfortable assuming that `Pin` will forever be `repr(transparent)` unless some memory-unsafety bug were discovered. – Shepmaster Nov 14 '20 at 01:44
  • @GuerlandoOCs as stated in the comments on your question, while `Box` is valid to put in the FFI signature, that doesn't mean that every use of it is correct. For example, you can't allocate in C and pass the pointer to Rust as `Box` because the allocators would mismatch. `Pin<Box<T>>` is safe to use in FFI in the same places that `Box` is. – Shepmaster Nov 14 '20 at 01:59