In my mind one of the ideal traits for a dependency injection container would look like:

pub trait ResolveOwn<T> {
    fn resolve(&self) -> T;
}

I don't know how to implement this for certain T. I keep stubbing my toes on variations and cousins of this error:

error[E0515]: cannot return value referencing local variable `X`

I'm used to dependency injection in C# where returning values referencing local variables is precisely how you implement the equivalent of that resolve function.

Here's an illustration that focuses on this aspect of dependency injection:

struct ComplexThing<'a>(&'a i32);

struct Module();

impl Module {
    fn resolve_foo(&self) -> i32 {
        todo!()
    }

    pub fn resolve_complex_thing_1(&self) -> ComplexThing {
        let foo = self.resolve_foo();
        ComplexThing(&foo)
    }
}

error[E0515]: cannot return value referencing local variable `foo`
  --> src/lib.rs:12:9
   |
12 |         ComplexThing(&foo)
   |         ^^^^^^^^^^^^^----^
   |         |            |
   |         |            `foo` is borrowed here
   |         returns a value referencing data owned by the current function

See? There's that error.

My first instinct (again, coming from C#) is to give the local variable a place to live in the returned value, because the local variable is created here but it needs to live at least as long as the returned value. Hmm... that sounds sort of like returning a closure. Let's see how that goes...

pub fn resolve_complex_thing_2<'a>(&'a self) -> impl FnOnce() -> ComplexThing<'a> {
    let foo = self.resolve_foo();
    move || ComplexThing(&foo)
}

error[E0515]: cannot return value referencing local data `foo`
  --> src/lib.rs:12:17
   |
12 |         move || ComplexThing(&foo)
   |                 ^^^^^^^^^^^^^----^
   |                 |            |
   |                 |            `foo` is borrowed here
   |                 returns a value referencing data owned by the current function

No joy. It doesn't work to package this closure up into a prettier type (like some impl of Into<ComplexThing<'a>>) because it's fundamentally about returning a value referencing local data.

My next instinct is to somehow jam the local data into some kind of weak cache inside my Module and then get a reference from there (undoubtedly unsafely). And then the weak cache will need to solve half of the hard problems in Computer Science (hint: the other hard problem is naming things). That's starting to sound an awful lot like... oh no. Garbage collection!

I also thought about inverting the flow of control. It's hideous and still doesn't work:

impl Module {
    pub fn use_foo<T>(&self, f: impl FnOnce(i32) -> T) -> T {
        (f)(42)
    }

    pub fn use_complex_thing<'a, T>(&'a self, f: impl FnOnce(ComplexThing<'a>) -> T) -> T {
        self.use_foo(
            |foo| (f)(ComplexThing(&foo)),
        )
    }
}

error[E0597]: `foo` does not live long enough
  --> src/lib.rs:12:36
   |
10 |     pub fn use_complex_thing<'a, T>(&'a self, f: impl FnOnce(ComplexThing<'a>) -> T) -> T {
   |                              -- lifetime `'a` defined here
11 |         self.use_foo(
12 |             |foo| (f)(ComplexThing(&foo)),
   |                   -----------------^^^^--
   |                   |                |    |
   |                   |                |    `foo` dropped here while still borrowed
   |                   |                borrowed value does not live long enough
   |                   argument requires that `foo` is borrowed for `'a`

My last instinct is to hack around the restriction against moving a value with active borrows, because then I could trick the compiler. My attempts at implementing that resulted in a type that's impossible to use correctly — it ended up requiring knowledge that only the compiler has and seemed to introduce undefined behavior at every turn. I won't bother reproducing that code here.

It seems like it's impossible to return any owned instances of types containing (non-singleton) references.

Assuming that's true, that means there are entire classes of types that simply cannot be created with a dependency injection container in Rust.

Surely I'm missing something?

Shepmaster
Matt Thomas
  • Is there something stopping you from moving the value into the container (e.g. changing `ComplexThing` to `i32` instead of `&i32`) instead of taking a reference? – Aplet123 Jan 13 '21 at 21:50
  • @Aplet123 I'll answer that along with the related question "can't you just change the &i32 to an Rc?"--some types in the wild hold references and there's nothing I can do to modify them. So in this specific case yes I can just modify my `ComplexThing`, but that just pushes the problem down the road to the next type that holds references – Matt Thomas Jan 13 '21 at 21:55
  • Dependency injection is **quite** common. It's one of the biggest reasons for generics. See [How can I test Rust methods that depend on environment variables?](https://stackoverflow.com/q/35858323/155423); [Is there a cleaner way to test functions that use functions that require user input in Rust?](https://stackoverflow.com/q/47077925/155423); [How can I test stdin and stdout?](https://stackoverflow.com/q/28370126/155423). – Shepmaster Jan 13 '21 at 22:01
  • @Shepmaster Perhaps you could suggest a better title for my question, because I don't think those get to the heart of it. For example, [the most popular DI frameworks in Rust](https://crates.io/keywords/dependency-injection) appear to only support structs containing `Rc` and `Arc`, not plain references. So it appears it _is_ impossible for my `ResolveOwn` trait to be implemented for _all_ `T`. Meaning a DI container is impossible to make that handles all `T` – Matt Thomas Jan 13 '21 at 22:04
  • @Shepmaster cont'd Is the alternative to just instantiate everything in one big tree in `main`? – Matt Thomas Jan 13 '21 at 22:05
  • *dependency injection in C# where returning values referencing local variables is precisely how you implement* I don’t know C# that well, but I doubt it, at least not for the same meanings. C# has a garbage collector, so you can return a value and the GC takes care of it. That’s closest to Rust’s `Arc`. – Shepmaster Jan 13 '21 at 22:44
  • Yeah, the issue here is that C#'s semantics about values and ownership do not line up neatly with Rust's value and ownership semantics. As Shep says, C#'s semantics are more similar to wrapping anything in Rust in an `Arc`, and cloning anytime you need to hand out a reference. This is true except for Rust types that implement `Copy`; these are most similar to C# `struct`s. However, for this type of factory-based dependency injection, you just don't return a *reference*, you pass ownership. If the other struct only contains a reference, then you need to figure out who needs to own the `T` – Zarenor Jan 14 '21 at 01:31
  • It looks like your question might be answered by the answers of [Why can't I store a value and a reference to that value in the same struct?](https://stackoverflow.com/q/32300132/155423); [Is there any way to return a reference to a variable created in a function?](https://stackoverflow.com/q/32682876/155423). If not, please **[edit]** your question to explain the differences. Otherwise, we can mark this question as already answered. – Shepmaster Jan 14 '21 at 02:09
  • @Shepmaster Thank you for your help with my question. I think your first title edit was excellent and captured the spirit of my question. I'm still thinking about the second one though. If dependency injection containers necessarily lead to error[E0515] then I like this new title and my question is indirectly answered by such answers and DI containers are impossible in Rust (at least they only work for _some_ T, not _all_ T). But I'm not sure DI containers in Rust necessarily lead to that error – Matt Thomas Jan 14 '21 at 02:19
  • DI containers **are** possible in Rust; you've linked to frameworks that could be called such yourself. However, *your specific issue* is around returning a reference to a local variable, you've just inexorably made that a requirement for the implementation of a DI container (incorrectly, I believe). Because *your* view of a DI container requires that specific circumstance, one that I do not believe is a part of the general definition of a DI container, it makes sense to highlight that difference up front. – Shepmaster Jan 14 '21 at 02:26
  • @Shepmaster Thank you. I think that clears it up in my head. Then yes I guess it is fundamentally about returning references to local variables, and the answers to "how do I return references to local variables?" questions are the answers to this one. I think what confused me the most is I have this really simple interface that cannot be implemented for a lot of `T` (without a lot of effort, if at all). That's definitely different than other languages. And I'm learning that's not necessarily a Bad Thing – Matt Thomas Jan 14 '21 at 13:24

2 Answers

You can try a drop guard: use Box::leak to obtain a reference that lives long enough, then give the guard a custom Drop implementation that reclaims the leaked memory. Note that this will require you to do everything through the drop guard:

use std::marker::PhantomData;
use std::mem::ManuallyDrop;

struct ComplexThing<'a>(&'a i32);

struct Module;

pub struct DropGuard<'a, T: 'a, V: 'a> {
    // do NOT make these fields pub
    // direct manipulation of these is very unsafe
    container: ManuallyDrop<T>,
    value: *mut V,
    // I'm not sure this is needed but better safe than sorry
    _value: PhantomData<&'a mut V>,
}

impl<'a, T: 'a, V: 'a> DropGuard<'a, T, V> {
    pub fn new<F: FnOnce(&'a mut V) -> T>(value: Box<V>, gen: F) -> Self {
        // leak the value so it lives long enough
        let leaked = Box::leak(value);
        // get a pointer to know what to drop
        let leaked_ptr: *mut _ = leaked;
        DropGuard {
            container: ManuallyDrop::new(gen(leaked)),
            value: leaked_ptr,
            _value: PhantomData,
        }
    }
}

// so you can actually use it
// no DerefMut since dropping the container without dropping the guard is weird
impl<'a, T: 'a, V: 'a> std::ops::Deref for DropGuard<'a, T, V> {
    type Target = T;

    fn deref(&self) -> &Self::Target {
        &self.container
    }
}

impl<'a, T: 'a, V: 'a> Drop for DropGuard<'a, T, V> {
    fn drop(&mut self) {
        // drop the container first
        // this should be safe since self.container is never referenced again
        // the value it's borrowing is still valid (due to not being dropped yet)
        // and there should be no references to it (due to this struct being dropped)
        unsafe {
            ManuallyDrop::drop(&mut self.container);
        }
        // now drop the pointer
        // this should be safe since it was created with Box::leak
        // and the container borrowing it has already been dropped
        // and no more references should have survived
        std::mem::drop(unsafe { Box::from_raw(self.value) });
    }
}

impl Module {
    pub fn resolve_foo(&self) -> i32 {
        5
    }

    pub fn resolve_complex_thing_1(&self) -> DropGuard<ComplexThing, i32> {
        DropGuard::new(Box::new(self.resolve_foo()), |i32_ref| {
            ComplexThing(i32_ref)
        })
    }
}

fn main() {
    let module = Module;
    let guard = module.resolve_complex_thing_1();
    println!("{:?}", guard.0);
}

Another way that also cleans up the typing is to use a trait:

use std::marker::PhantomData;
use std::mem::ManuallyDrop;

struct ComplexThing<'a>(&'a i32);

struct Module;

// not sure if this trait should be unsafe
// but again, better safe than sorry
pub unsafe trait Guardable {
    type Value;
}

unsafe impl Guardable for ComplexThing<'_> {
    type Value = i32;
}

pub struct DropGuard<'a, T: 'a + Guardable> {
    // do NOT make these fields pub
    // direct manipulation of these is very unsafe
    container: ManuallyDrop<T>,
    value: *mut T::Value,
    // I'm not sure this is needed but better safe than sorry
    _value: PhantomData<&'a mut T::Value>,
}

impl<'a, T: 'a + Guardable> DropGuard<'a, T> {
    pub fn new<F: FnOnce(&'a mut T::Value) -> T>(value: Box<T::Value>, gen: F) -> Self {
        // leak the value so it lives long enough
        let leaked = Box::leak(value);
        // get a pointer to know what to drop
        let leaked_ptr: *mut _ = leaked;
        DropGuard {
            container: ManuallyDrop::new(gen(leaked)),
            value: leaked_ptr,
            _value: PhantomData,
        }
    }
}

// so you can actually use it
// no DerefMut since dropping the container without dropping the guard is weird
impl<'a, T: 'a + Guardable> std::ops::Deref for DropGuard<'a, T> {
    type Target = T;

    fn deref(&self) -> &Self::Target {
        &self.container
    }
}

impl<'a, T: 'a + Guardable> Drop for DropGuard<'a, T> {
    fn drop(&mut self) {
        // drop the container first
        // this should be safe since self.container is never referenced again
        // the value it's borrowing is still valid (due to not being dropped yet)
        // and there should be no references to it (due to this struct being dropped)
        unsafe {
            ManuallyDrop::drop(&mut self.container);
        }
        // now drop the pointer
        // this should be safe since it was created with Box::leak
        // and the container borrowing it has already been dropped
        // and no more references should have survived
        std::mem::drop(unsafe { Box::from_raw(self.value) });
    }
}

impl Module {
    pub fn resolve_foo(&self) -> i32 {
        5
    }

    pub fn resolve_complex_thing_1(&self) -> DropGuard<ComplexThing> {
        DropGuard::new(Box::new(self.resolve_foo()), |i32_ref| {
            ComplexThing(i32_ref)
        })
    }
}

fn main() {
    let module = Module;
    let guard = module.resolve_complex_thing_1();
    println!("{:?}", guard.0);
}

Since each container type should have only one valid value type for its DropGuard, you can put that type in an associated type on a trait. You can then work with DropGuard<ComplexThing> instead of DropGuard<ComplexThing, i32>, and this also prevents bogus value types from ending up in the DropGuard.

Aplet123
  • Isn’t this just a re-implementation of `Rc` or `Arc`? – Shepmaster Jan 13 '21 at 23:53
  • @Shepmaster I thought about that, yet I can't really find a way to make a self-referential `Rc`. – Aplet123 Jan 13 '21 at 23:59
  • I think `DropGuard` can lead to double dropping if you do `DropGuard::new(boxed_thing, |thing_ref| thing_ref)`... then both the `container` and the `value` will have the same pointer. `DropGuard` is the kind of thing that ends up "requiring knowledge that only the compiler has" to work for all possible `T` and `V`. _But_ it would help accomplish the goal. It could always be made unsafe and documented very clearly that here there be monsters, and then I would just be careful to use it properly. Thank you! – Matt Thomas Jan 14 '21 at 02:04
  • Oh and "this will require you to do everything through the drop guard" isn't really a problem. Because callers can compose with a `DropGuard` as well. I think this just might work... – Matt Thomas Jan 14 '21 at 02:06
  • @MattThomas I believe that the double dropping won't happen, as in your case the container will be a reference, and dropping a reference shouldn't do anything. Regardless, I've edited in an implementation that uses a trait that both cleans up the syntax and solves this issue (if it even exists). – Aplet123 Jan 14 '21 at 02:38
  • I'd use [`Box::into_raw`](https://doc.rust-lang.org/stable/std/boxed/struct.Box.html#method.into_raw) instead of `Box::leak`. `leak` suggests that the box will actually be leaked, which is not the case here. – Francis Gagné Jan 14 '21 at 06:02
  • The `Guardable` trait is marked unsafe without a description of the rules that its implementers must follow to avoid undefined behavior. (Such a description is sometimes referred to as an "unsafe contract".) "Better (un)safe than sorry" is unhelpful because `unsafe` is not magic - the mere fact that a trait is marked unsafe won't change the code generated by the compiler. – user4815162342 Jan 14 '21 at 12:20
  • @user4815162342 I can't think of any cases where entirely safe code implementing the trait will lead to UB, so I can't write such a contract. If you can think of any cases, I'd be glad to update the post with them. – Aplet123 Jan 14 '21 at 12:25
  • Fair enough - then I think the trait shouldn't be unsafe. Also, please consider using `Box::into_raw` for obtaining the pointer while consuming the box - this is precisely the use case for that function. – user4815162342 Jan 14 '21 at 12:48
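
The `Box::into_raw` suggestion from the comments above can be sketched roughly as follows. This is only a minimal illustration of the raw-pointer round trip, not a drop-in change to `DropGuard`; the function name `round_trip` is hypothetical.

```rust
/// Hypothetical helper: hands a boxed value over to a raw pointer,
/// reads through it, then reclaims and frees the allocation.
fn round_trip(value: i32) -> i32 {
    // Consume the box and take over responsibility for the allocation.
    let raw: *mut i32 = Box::into_raw(Box::new(value));

    // SAFETY: `raw` came from `Box::into_raw` and has not been freed,
    // so it is valid, aligned, and points to an initialized `i32`.
    let seen = unsafe { *raw };

    // SAFETY: `raw` came from `Box::into_raw` and is reclaimed exactly once.
    let reclaimed = unsafe { Box::from_raw(raw) };
    drop(reclaimed); // the allocation is freed here

    seen
}

fn main() {
    println!("{}", round_trip(5));
}
```

Unlike `Box::leak`, `Box::into_raw` returns the pointer directly, so no intermediate `&'a mut` reference has to be converted into one.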

Disclaimer: I'm fairly new to Rust; this answer is based on limited experience and may be un-nuanced.

As a general principle of Rust program design — not specific to dependency injection — you should plan not to use references except for things that are one of:

  • temporary, i.e. confined to the life of some stack frame (or technically longer than that in the case of async functions, but you get the idea, I hope)
  • compile-time constants or lazily initialized singletons, i.e. &'static references

The reason is that Rust does not — without various trickery — support lifetimes that are not one of those two cases.
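
As a rough sketch of the first case (a reference confined to a stack frame), a container can lend a borrowed value to a callback instead of returning it. The method name `with_complex_thing` is hypothetical, and the higher-ranked `for<'b>` bound is my assumption about how to express "any lifetime the callee picks"; it is not code from the question.

```rust
struct ComplexThing<'a>(&'a i32);

struct Module;

impl Module {
    fn resolve_foo(&self) -> i32 {
        42
    }

    /// Hypothetical method: lends a `ComplexThing` to `f` for the
    /// duration of this call only.
    fn with_complex_thing<T, F>(&self, f: F) -> T
    where
        // `for<'b>` means `f` must accept a borrow of any lifetime,
        // including one shorter than the borrow of `self`.
        F: for<'b> FnOnce(ComplexThing<'b>) -> T,
    {
        let foo = self.resolve_foo();
        f(ComplexThing(&foo))
        // `foo` is dropped here, after `f` has returned.
    }
}

fn main() {
    let module = Module;
    let answer = module.with_complex_thing(|thing| *thing.0 + 1);
    println!("{}", answer);
}
```

The result type `T` cannot mention `'b`, so nothing borrowed from `foo` can escape the call.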

Any structures which are needed for longer durations than that should be designed to not contain non-'static references — and instead use owned values. In other words, let your DI be like

pub trait ResolveOwn<T: 'static> {
//                   ^^^^^^^^^^
    fn resolve(&self) -> T;
}

Don't actually add that lifetime constraint: it doesn't buy you anything, and might be inconvenient (for example, in a test that wants to inject things referring to the test — which will work fine since they live longer than the entire DI container — or if the application's actual main() has something to share, similarly). But plan as if it were there.
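
To illustrate why the bound is best left off, here is a minimal sketch of a container that hands out references to data owned by its caller, as a test might. `TestModule` and its `foo` field are hypothetical names; `ComplexThing` is copied from the question.

```rust
pub trait ResolveOwn<T> {
    fn resolve(&self) -> T;
}

struct ComplexThing<'a>(&'a i32);

/// Hypothetical container that borrows data owned by its caller.
struct TestModule<'a> {
    foo: &'a i32,
}

impl<'a> ResolveOwn<ComplexThing<'a>> for TestModule<'a> {
    fn resolve(&self) -> ComplexThing<'a> {
        // Copies the `&'a i32` out of `self`; nothing local is borrowed.
        ComplexThing(self.foo)
    }
}

fn main() {
    let foo = 7; // owned by the caller, outlives the container
    let module = TestModule { foo: &foo };
    let thing: ComplexThing = module.resolve();
    println!("{}", thing.0);
}
```

This compiles because the referenced value lives in the caller's frame, not inside `resolve`.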


Given this constraint, how can you implement things that seem to want references?

  • In the simplest cases, just use owned values and don't worry about any extra cloning required unless it proves to be a performance issue.
  • Use Rc or Arc for reference-counted smart pointers that keep things alive as long as necessary.
  • If some T really requires references, but only into its own data, use a safe self-referential struct helper like ouroboros. (I believe this is similar to but more general than the suggestion in Aplet123's answer to this question.)

All of these strategies are independent of the DI container: they're ways to make a type satisfy the 'static lifetime bound ("contains no references except 'static ones").
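
A minimal sketch of the `Rc` strategy, assuming hypothetical `Config` and `Server` types: the resolved value owns a reference-counted handle instead of holding a plain reference, so it satisfies `'static` and can be returned freely.

```rust
use std::rc::Rc;

pub trait ResolveOwn<T> {
    fn resolve(&self) -> T;
}

// Hypothetical types, for illustration only.
struct Config {
    port: u16,
}

struct Server {
    // Shared ownership instead of `&'a Config`.
    config: Rc<Config>,
}

struct Module {
    config: Rc<Config>,
}

impl ResolveOwn<Rc<Config>> for Module {
    fn resolve(&self) -> Rc<Config> {
        // Cloning an `Rc` only bumps the reference count.
        Rc::clone(&self.config)
    }
}

impl ResolveOwn<Server> for Module {
    fn resolve(&self) -> Server {
        Server {
            config: <Self as ResolveOwn<Rc<Config>>>::resolve(self),
        }
    }
}

fn main() {
    let module = Module {
        config: Rc::new(Config { port: 8080 }),
    };
    let server: Server = module.resolve();
    println!("port {}", server.config.port);
}
```

Swapping `Rc` for `Arc` should give the same shape for values shared across threads.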

Kevin Reid