I'm writing Rust bindings for a C API. This particular C API has two functions: f1 and f2. f1 returns a handle that references internal data that's valid until f2 is called¹.

What are my options for modeling the handle's lifetime constraints? Preferably enforceable at compile time, though if that is not possible at all I can live with establishing correctness at runtime as well.

A solution can assume the following restrictions:

  • Each call to f1 needs to be followed by a call to f2 before calling f1 again. In other words, there cannot ever be two or more consecutive calls to either function.
  • All functions are called from the same thread.

Things I tried

I had looked into using the PhantomData marker struct, though that won't work here as I don't have access to the underlying data referenced by the handle.

Another option I had played around with was removing f2 from the public API surface altogether, and having clients pass a function into f1 that can safely assume a valid handle:

pub fn f1(f: fn(h: &Handle) -> ()) {
    let h = unsafe { api::f1() };
    // Execute client-provided code
    f(&h);
    unsafe { api::f2() };
}

While that works in enforcing lifetime constraints by never allowing the Handle to escape f1 (I think), it feels like it's taking away too much control from clients. This is library code and I'd rather not turn this into a framework.
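As a minimal sketch of this closure-scoped variant: the version below accepts any `FnOnce` closure, lets it return a value, and uses a drop guard so that f2 still runs if the closure panics. The `api` module here is a hypothetical stand-in for the real FFI functions, and the names `with_handle`, `CallF2OnDrop`, `HANDLE_LIVE`, and `Handle::value` are assumptions made for illustration only.

```rust
// Hypothetical stand-in for the real FFI module, so the sketch runs on its own.
mod api {
    use std::cell::Cell;

    thread_local! {
        // Tracks whether a handle is currently valid (illustration only).
        pub static HANDLE_LIVE: Cell<bool> = Cell::new(false);
    }

    pub unsafe fn f1() -> u32 {
        HANDLE_LIVE.with(|live| live.set(true));
        42
    }

    pub unsafe fn f2() {
        HANDLE_LIVE.with(|live| live.set(false));
    }
}

// Deliberately neither Clone nor Copy, and only ever handed out by reference.
pub struct Handle(u32);

impl Handle {
    pub fn value(&self) -> u32 {
        self.0
    }
}

// Calls f2 when dropped, including while unwinding from a panic in the closure.
struct CallF2OnDrop;

impl Drop for CallF2OnDrop {
    fn drop(&mut self) {
        unsafe { api::f2() }
    }
}

// The closure only sees `&Handle`, and `R` cannot borrow from it,
// so the handle cannot escape this function.
pub fn with_handle<R>(f: impl FnOnce(&Handle) -> R) -> R {
    let h = Handle(unsafe { api::f1() });
    let _guard = CallF2OnDrop;
    f(&h)
}
```

A call such as `with_handle(|h| h.value() * 2)` would then hand the computed value back to the caller while the handle itself stays confined to the closure.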

Another alternative I had considered was having clients move the handle into f2 to transition ownership back into the library implementation:

pub fn f2(_h: Handle) {
    unsafe { api::f2() };
}

That, too, seems to work (I think), although it introduces a seemingly unrelated parameter into f2's signature, making for a somewhat confusing API.
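A slightly fuller sketch of this consuming variant might look as follows. The `api` module is again a hypothetical stub in place of the real bindings, and `Handle::raw` and `F2_CALLS` are illustrative names. Because `Handle` is neither `Clone` nor `Copy`, any use of it after `f2(h)` is rejected at compile time as a use of a moved value. Note that nothing here stops a caller from dropping the handle without calling f2, or from calling f1 twice in a row.

```rust
// Hypothetical stand-in for the real FFI module.
mod api {
    use std::cell::Cell;

    thread_local! {
        // Counts f2 calls so the effect is observable (illustration only).
        pub static F2_CALLS: Cell<u32> = Cell::new(0);
    }

    pub unsafe fn f1() -> u32 {
        7
    }

    pub unsafe fn f2() {
        F2_CALLS.with(|c| c.set(c.get() + 1));
    }
}

// No Clone/Copy: ownership of the handle can only be moved.
pub struct Handle {
    raw: u32,
}

impl Handle {
    pub fn raw(&self) -> u32 {
        self.raw
    }
}

pub fn f1() -> Handle {
    Handle { raw: unsafe { api::f1() } }
}

// Taking the handle by value ends its lifetime at the call site.
pub fn f2(_h: Handle) {
    unsafe { api::f2() };
}

// After `f2(h)`, a line such as `h.raw()` fails to compile
// because `h` has been moved into `f2`.
```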

Question

What's the (canonical) solution here that I cannot see?


¹ f2 isn't strictly cleanup code. It is called for different reasons, and only invalidates the reference returned by f1 as a side effect.

Shepmaster
IInspectable
  • 6
    Both patterns are fairly common and support your use case. I'd suggest the last option though. *"A handle... that's valid until f2 is called"* sounds exactly like a *consuming* function call and would communicate at compile-time that the handle isn't valid afterwards. The function pattern is more restrictive since everything must be done at once (but that may be more appropriate depending on the intended usage). – kmdreko Dec 22 '20 at 09:34
  • 5
    I don't think the second option makes for a confusing API. If `f2()` eats the handle, it should take ownership of it, regardless of whether this happens as a side effect. In my opinion, it makes the API _clearer_. – Sven Marnach Dec 22 '20 at 09:47
  • 3
    [Is there a way to enforce that a Rust raw pointer is not used after returning from a specific stack frame?](https://stackoverflow.com/q/61106587/155423); [Adding lifetime constraints to non-reference types](https://stackoverflow.com/q/28174681/155423) – Shepmaster Dec 22 '20 at 15:26
  • Maybe I misunderstand something, but I do not see how the second solution forces "Each call to f1 needs to be followed by a call to f2 before calling f1 again"; it allows not calling `f2` and simply dropping the handle instead. Or e.g. `let h1 = f1(); let h2 = f1(); f2(h1); f2(h2);`. _If_ my understanding is correct, I'd prefer option 1, and maybe `unsafe` versions of `f1` and `f2` with option 2. – Alexey Romanov Dec 22 '20 at 18:48

1 Answer

You can define two structs:

struct FirstState { ... }
struct SecondState { ... }

Then you will be able to define methods that transform each into the other:

impl FirstState {
    // note: Takes ownership of self
    pub fn into_second(self) -> SecondState {
        unsafe { api::f2() };
        SecondState { ... }
    }
}

impl SecondState {
    // note: Takes ownership of self
    pub fn into_first(self) -> FirstState {
        unsafe { api::f1() };
        FirstState { ... }
    }
}

Since each conversion consumes the object it is called on, you are forced to alternate between calling f1 and f2, provided only one state object exists at a time. Additionally you can define methods like this:
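To see the alternation in a form that compiles, here is a minimal sketch with the elided fields replaced by a private unit field. The `api` module is a hypothetical stub that records calls, and `SecondState::start` is an assumed constructor for the initial state; it does not check that it is called only once.

```rust
// Hypothetical stand-in for the real FFI module; it just logs the call order.
mod api {
    use std::cell::RefCell;

    thread_local! {
        static CALLS: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
    }

    pub unsafe fn f1() {
        CALLS.with(|c| c.borrow_mut().push("f1"));
    }

    pub unsafe fn f2() {
        CALLS.with(|c| c.borrow_mut().push("f2"));
    }

    pub fn calls() -> Vec<&'static str> {
        CALLS.with(|c| c.borrow().clone())
    }
}

// The private field keeps code outside this module from constructing the states.
pub struct FirstState {
    _private: (),
}

pub struct SecondState {
    _private: (),
}

impl SecondState {
    // Assumed entry point: nothing here prevents calling it more than once.
    pub fn start() -> SecondState {
        SecondState { _private: () }
    }

    // Consumes the idle state, so f1 cannot be called twice on the same value.
    pub fn into_first(self) -> FirstState {
        unsafe { api::f1() };
        FirstState { _private: () }
    }
}

impl FirstState {
    // Consumes the active state, so f2 cannot be called twice on the same value.
    pub fn into_second(self) -> SecondState {
        unsafe { api::f2() };
        SecondState { _private: () }
    }
}
```

Writing `idle.into_first()` twice on the same `idle` value does not compile, because the first call moves it.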

impl FirstState {
    pub fn get_internal_data(&self) -> &InternalData {
        ...
    }
}

The signature of get_internal_data enforces that the returned reference cannot outlive FirstState, even if the data is not stored inside FirstState itself.
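A small sketch of that borrow relationship, assuming the C side hands back a raw pointer: `FirstState` stores only the pointer, yet the reference returned by `get_internal_data` is tied to `&self` by lifetime elision. The `api` stub, the `InternalData` layout, and the `FirstState::acquire` constructor are all hypothetical.

```rust
pub struct InternalData {
    pub value: i32,
}

// Hypothetical stand-in for the real FFI module.
mod api {
    use super::InternalData;

    static DATA: InternalData = InternalData { value: 7 };

    pub unsafe fn f1() -> *const InternalData {
        &DATA
    }

    pub unsafe fn f2() {}
}

// The data lives on the C side; only a raw pointer is stored here.
pub struct FirstState {
    data: *const InternalData,
}

pub struct SecondState;

impl FirstState {
    // Assumed constructor for illustration.
    pub fn acquire() -> FirstState {
        FirstState { data: unsafe { api::f1() } }
    }

    // Elided lifetimes make the result borrow from `self`.
    pub fn get_internal_data(&self) -> &InternalData {
        // SAFETY (assumption of this sketch): the pointer stays valid until f2
        // is called, and f2 is only reachable by consuming `self`.
        unsafe { &*self.data }
    }

    pub fn into_second(self) -> SecondState {
        unsafe { api::f2() };
        SecondState
    }
}

// Holding the reference across `into_second` is rejected by the borrow checker:
//     let d = first.get_internal_data();
//     let second = first.into_second(); // error: `first` is still borrowed
//     println!("{}", d.value);
```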

Alice Ryhl
  • Isn't this essentially the same as my final option, though, i.e. manually terminating the lifetime by passing ownership? Although I'm not sure I understand where the first `FirstState` object came from, or what's there to prevent that from being called a second time before transitioning `into_second`. – IInspectable Dec 25 '20 at 16:41
  • You would have to build your own constructor. There is no way to verify at compile time that you only build one of them, so the best you can do is a global that panics on second creation. – Alice Ryhl Dec 25 '20 at 16:55
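One way the "global that panics on second creation" could be sketched is with an atomic flag. The names `CREATED`, `SecondState::new`, and `SecondState::try_new` are assumptions for illustration, not part of the answer above.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Set once the single initial state object has been handed out.
static CREATED: AtomicBool = AtomicBool::new(false);

pub struct SecondState {
    _private: (),
}

impl SecondState {
    // Non-panicking variant: returns None if the state was already created.
    pub fn try_new() -> Option<SecondState> {
        // `swap` returns the previous value, so only the first caller sees `false`.
        if CREATED.swap(true, Ordering::SeqCst) {
            None
        } else {
            Some(SecondState { _private: () })
        }
    }

    // Panics on the second and any later call.
    pub fn new() -> SecondState {
        Self::try_new().expect("SecondState may only be created once")
    }
}
```

This moves the uniqueness check to runtime; the alternation between the two states is still checked at compile time.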