6

I've created rust bindings to a C library and currently writing safe wrappers around it.

The question is about C functions which take in C function pointers which are not able to take in any custom user data.

It is easier to explain it with an example,

C Library:

// The function pointer I need to pass,
typedef void (*t_function_pointer_library_wants)(const char *argument);
// The function which I need to pass the function pointer to,
void register_hook(const t_function_pointer_library_wants hook);

Bindings:

// For the function pointer type
pub type t_function_pointer_library_wants = ::std::option::Option<unsafe extern "C" fn(argument: *const ::std::os::raw::c_char)>;
// For the function which accepts the pointer
extern "C" {
    pub fn register_hook(hook: t_function_pointer_library_wants);
}

It would have been very nice if I could expose an api to the user like the following,

// Let's assume my safe wrapper is named on_something
// ..
on_something(|argument|{
    // Do something with the argument..
});
// ..

although according to the sources below, the lack of ability to hand over the management of the part of memory which would store my closure's state to C, prevents me to create this sort of API. Because the function pointer in C is stateless and does not take in any user data of some sort. (Please correct me if I'm wrong.)

I've come to this conclusion by reading these sources and similar ones:

Trampoline Technique

Similar Trampoline Technique

Hacky Thread Local Technique

Sources in Shepmaster's answer

As a fallback, I maybe can imagine an API like this where I pass a function pointer instead.

fn handler(argument: &str) {
    // Do something with the argument..
}
//..
on_something(handler);
//..

But I am a little confused about converting an fn(&str),

to an unsafe extern "C" fn(argument: *const std::os::raw::c_char)..

I'd be very glad if you could point me to the right direction.

* The actual library in focus is libpd and there is an issue I've created related to this.

Thanks a lot.

Ali Somay
  • 585
  • 8
  • 20

1 Answers1

3

First off, this is a pretty hard problem to solve. Obviously, you need some way to pass data into a function outside of its arguments. However, pretty any method of doing that via a static could easily result in race conditions or worse, depending on what the c library does and how the library is used. The other option is to JIT some glue code that calls your closure. At first glance, that seems even worse, but libffi abstracts most of that away. A wrapper using the libffi crate would like this:

use std::ffi::CStr;
use libffi::high::Closure1;

fn on_something<F: Fn(&str) + Send + Sync + 'static>(f: F) {
    let closure: &'static _ = Box::leak(Box::new(move |string: *const c_char| {
        let string = unsafe { CStr::from_ptr(string).to_str().unwrap() };
        f(string);
    }));
    let callback = Closure1::new(closure);
    let &code = callback.code_ptr();
    let ptr:unsafe extern "C" fn (*const c_char) = unsafe { std::mem::transmute(code) };
    std::mem::forget(callback);
    unsafe { register_handler(Some(ptr)) };
}

I don't have a playground link, but it worked fine when I tested it locally. There are two important things to note with this code:

  1. It's maximally pessimistic about what the c code does, assuming the function is repeatedly called from multiple threads for the entire duration of the program. You may be able to get away with fewer restrictions, depending on what libpd does.

  2. It leaks memory to ensure the callback is valid for the life of the program. This is probably fine since callbacks are typically only set once. There is no way to safely recover this memory without keeping around a pointer to the callback that was registered.

It's also worth noting that the libffi::high::ClosureMutN structs are unsound, as they permit aliasing mutable references to the passed wrapped closure. There is a PR to fix that waiting to be merged though.

Aiden4
  • 2,504
  • 1
  • 7
  • 24
  • First of all thanks a lot for the detailed answer. It is a surprise that I didn't come across this crate when I was researching. It is almost tailored for the problem I'm trying to solve. I gave your solution a spin, although I get a `signal: 11, SIGSEGV: invalid memory reference` as soon as the library calls the function. `Box::leak` should prevent dropping the closure in the end of scope and `transmute` should be also fine also if we're trusting `libffi` providing us an equivelent type for the function pointer expected. Any ideas? – Ali Somay Jan 25 '22 at 20:51
  • 1
    @AliSomay the code I used to test it is [this](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=c037fa0f16df5a4534d80a13b111374d). I'll write some actual c and dig into the issue some more, but do make sure that the pointer you pass in is a null-terminated string. – Aiden4 Jan 25 '22 at 21:11
  • I've copied your code and make some tests myself also. It works perfectly. I see that libpd declares a `char buffer[SOME_SIZE]` fills it with [vsnprintf](https://www.cplusplus.com/reference/cstdio/vsnprintf/) and then calls `strcat(buffer, "\n");` finally before passing a `*char` to the function pointer. "strcat will look for the null-terminator, interpret that as the end of the string, and append the new text there, overwriting the null-terminator in the process, and writing a new null-terminator at the end of the concatenation." I think the string passed in is null terminated. – Ali Somay Jan 25 '22 at 21:39
  • You may find an example section where libpd prepares the string [here](https://github.com/pure-data/pure-data/blob/209347cbb26bb7ecfcb2cbdd1c129dd3e487a7e1/src/s_print.c#L178) and where it calls the function with a `*char` [here](https://github.com/pure-data/pure-data/blob/209347cbb26bb7ecfcb2cbdd1c129dd3e487a7e1/src/s_print.c#L62). Maybe you would see something I miss. I'm also running the code in a test, but that shouldn't change anything. – Ali Somay Jan 25 '22 at 21:45
  • 1
    @AliSomay `mem::forget`ing the `closure` variable seems to reliably fix the issue, and I'm not entirely sure why. I'll dig into why later when I have more time. – Aiden4 Jan 25 '22 at 22:54
  • I've also set up a testing environment with real minimal C code. I think this all makes sense. Correct me if I'm wrong, the first boxing of the closure also shouldn't be necessary because the only thing we want to prevent rust from dropping is the `callback` because that is the piece of memory we hand over. I guess `std::mem::forget(callback)` would be the most risky one to use to make that happen. What about `Box::into_raw(Box::new(callback))` or `std::mem::ManuallyDrop::new(callback)`. I tried all those and they all work fine. – Ali Somay Jan 25 '22 at 23:19
  • What about [this](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=db894c6e96710e13c75eb6cf12f7817c) piece of code? – Ali Somay Jan 25 '22 at 23:19
  • By the way, I think we should continue the discussion to remove unnecessary parts of the code in this answer or try to replace some techniques with better ones. Apart from all that, in the test environment and with the real library currently it works! You were very helpful and I think this deserves to be the accepted answer. – Ali Somay Jan 25 '22 at 23:27
  • 1
    @AliSomay the closure has to be boxed since `callback` refers to it, otherwise it will be a use-after-free whenever the callback is called. `Closure1` frees the allocations made by libffi when dropped, which was what caused the segfaults with previous versions of the code. Since the closure only maintains a pointer to data allocated by libffi, just `mem::forget`ing it is fine (and saves you some memory), but leaking the data on the heap is fine, although I would recommend `Box::leak` over `Box::into_raw`. – Aiden4 Jan 26 '22 at 05:32