I am interested in knowing the idiomatic/canonical way of making self-referential structures in Rust. The related question Why can't I store a value and a reference to that value in the same struct explains the problem, but try as I might, I couldn't figure out the answer in the existing question (although there were some useful hints).
I have come up with a solution, but I am unsure of how safe it is, or if it is the idiomatic way to solve this problem; if it isn't, I would very much like to know what the usual solution is.
I have an existing structure in my program that holds a reference to a sequence. Sequences hold information about chromosomes so they can be rather long, and copying them isn't a viable idea.
// My real Foo is more complicated than this and is an existing
// type I'd rather not have to rewrite if I can avoid it...
struct Foo<'a> {
x: &'a [usize],
// more here...
}
impl<'a> Foo<'a> {
pub fn new(x: &'a [usize]) -> Self {
Foo {
x, /* more here... */
}
}
}
I now need a new structure that reduces the sequence to something smaller and then builds a Foo
structure over the reduced string, and since someone has to own both reduced string and Foo
object, I would like to put both in a structure.
// My real Bar is slightly more complicated, but it boils down to having
// a vector it owns and a Foo over that vector.
struct Bar<'a> {
x: Vec<usize>,
y: Foo<'a>, // has a reference to &x
}
// This doesn't work because x is moved after y has borrowed it
impl<'a> Bar<'a> {
pub fn new() -> Self {
let x = vec![1, 2, 3];
let y = Foo::new(&x);
Bar { x, y }
}
}
Now, this doesn't work because the Foo
object in a Bar
refers into the Bar
object
and if the Bar
object moves, the reference will point into memory that is no longer occupied by the Bar
object
To avoid this problem, the x
element in Bar
must sit on the heap and not move around. (I think the data in a Vec
already sits happily on the heap, but that doesn't seem to help me here).
A pinned box should do the trick, I belive.
struct Bar<'a> {
x: Pin<Box<Vec<usize>>>,
y: Foo<'a>,
}
Now the structure looks like this
and when I move it, the references point to the same memory.
However, moving x
to the heap isn't enough for the type-checker. It still thinks that moving the pinned box will move what it points to.
If I implement Bar
's constructor like this:
impl<'a> Bar<'a> {
pub fn new() -> Self {
let v: Vec<usize> = vec![1, 2, 3];
let x = Box::pin(v);
let y = Foo::new(&x);
Bar { x, y }
}
}
I get the error
error[E0515]: cannot return value referencing local variable `x`
--> src/main.rs:22:9
|
21 | let y = Foo::new(&x);
| -- `x` is borrowed here
22 | Bar { x, y }
| ^^^^^^^^^^^^ returns a value referencing data owned by the current function
error[E0505]: cannot move out of `x` because it is borrowed
--> src/main.rs:22:15
|
17 | impl<'a> Bar<'a> {
| -- lifetime `'a` defined here
...
21 | let y = Foo::new(&x);
| -- borrow of `x` occurs here
22 | Bar { x, y }
| ------^-----
| | |
| | move out of `x` occurs here
| returning this value requires that `x` is borrowed for `'a`
Some errors have detailed explanations: E0505, E0515.
For more information about an error, try `rustc --explain E0505`.
Even though the object I take a reference of sits on the heap, and doesn't move, the checker still sees me borrowing from an object that moves, and that, of course, is a no-no.
Here, you might stop and notice that I am trying to make two pointers to the same object, so Rc
or Arc
is an obvious solution. And it is, but I would have to change the implementation of Foo
to have an Rc
member instead of a reference. While I do have control of the source code for Foo
, and I could update it and all the code that uses it, I am reluctant to make such a major change if I can avoid it. And I could have been in a situation where I am not in control of the Foo
, so I couldn't change its implementation, and I would love to know how I would solve that situation then.
The only solution I could get to work was to get a raw pointer to x
, so the type-checker doesn't see that I borrow it, and then connect x
and y
though that.
impl<'a> Bar<'a> {
pub fn new() -> Self {
let v: Vec<usize> = vec![1, 2, 3];
let x = Box::new(v);
let (x, y) = unsafe {
let ptr: *mut Vec<usize> = Box::into_raw(x);
let w: &Vec<usize> = ptr.as_ref().unwrap();
(Pin::new(Box::from_raw(ptr)), Foo::new(&w))
};
Bar { x, y }
}
}
Playground code here
What I don't know is if this is the right way to do it. It seems rather complicated, but perhaps it is the only way to make a structure like this in Rust? That some sort of unsafe
is needed to trick the compiler. So that is the first of my questions.
The second is, if this is safe to do? Of course it is unsafe
in the technical sense, but am I risking creating a reference to memory that might not be valid later? It is my impression that Pin
should guarantee that the object remains where it is supposed to sit, and that the lifetime of the Bar<'a>
and Foo<'a>
objects should ensure that the reference doesn't out-live the vector, but once I have gone unsafe
, could that promise be broken?
Update
The owning_ref
crate has functionality that looks like what I need. You can create owned objects that present their references as well.
There is an OwningRef
type that wraps an object and a reference, and it would be wonderful if you could have the slice in that and getting the reference wasn't seen as borrowing from the object, but obviously that isn't the case. Code such as this
use owning_ref::OwningRef;
struct Bar<'a> {
x: OwningRef<Vec<usize>, [usize]>,
y: Foo<'a>, // has a reference to &x
}
// This doesn't work because x is moved after y has borrowed it
impl<'a> Bar<'a> {
pub fn new() -> Self {
let v: Vec<usize> = vec![1, 2, 3];
let x = OwningRef::new(v);
let y = Foo::new(x.as_ref());
Bar { x, y }
}
}
you get the error
error[E0515]: cannot return value referencing local variable `x`
--> src/main.rs:22:9
|
21 | let y = Foo::new(x.as_ref());
| ---------- `x` is borrowed here
22 | Bar { x, y }
| ^^^^^^^^^^^^ returns a value referencing data owned by the current function
error[E0505]: cannot move out of `x` because it is borrowed
--> src/main.rs:22:15
|
17 | impl<'a> Bar<'a> {
| -- lifetime `'a` defined here
...
21 | let y = Foo::new(x.as_ref());
| ---------- borrow of `x` occurs here
22 | Bar { x, y }
| ------^-----
| | |
| | move out of `x` occurs here
| returning this value requires that `x` is borrowed for `'a`
Some errors have detailed explanations: E0505, E0515.
For more information about an error, try `rustc --explain E0505`.
error: could not compile `foo` due to 2 previous errors
The reason is the same as before: I borrow a reference to x
and then I move it.
There are different wrapper objects in the crate, and in various combinations they will let me get close to a solution and then snatch it away from me, because what I borrow I still cannot move later, e.g.:
use owning_ref::{BoxRef, OwningRef};
struct Bar<'a> {
x: OwningRef<Box<Vec<usize>>, Vec<usize>>,
y: Foo<'a>, // has a reference to &x
}
// This doesn't work because x is moved after y has borrowed it
impl<'a> Bar<'a> {
pub fn new() -> Self {
let v: Vec<usize> = vec![1, 2, 3];
let v = Box::new(v); // Vector on the heap
let x = BoxRef::new(v);
let y = Foo::new(x.as_ref());
Bar { x, y }
}
}
error[E0515]: cannot return value referencing local variable `x`
--> src/main.rs:23:9
|
22 | let y = Foo::new(x.as_ref());
| ---------- `x` is borrowed here
23 | Bar { x, y }
| ^^^^^^^^^^^^ returns a value referencing data owned by the current function
error[E0505]: cannot move out of `x` because it is borrowed
--> src/main.rs:23:15
|
17 | impl<'a> Bar<'a> {
| -- lifetime `'a` defined here
...
22 | let y = Foo::new(x.as_ref());
| ---------- borrow of `x` occurs here
23 | Bar { x, y }
| ------^-----
| | |
| | move out of `x` occurs here
| returning this value requires that `x` is borrowed for `'a`
Some errors have detailed explanations: E0505, E0515.
For more information about an error, try `rustc --explain E0505`.
I can get around this by going unsafe
and work with a pointer, of course, but then I am back to the solution I had with Pin
and pointer hacking. I strongly feel that there is a solution here, (especially because having a Box<Vec<...>>
and the corresponding Vec<...>
isn't adding much to the table so there must be more to the crate), but what it is is eluding me.