Writing a simple interpreter has led me to this battle with the borrow checker.

#[derive(Clone, Debug)]
struct Context<'a> {
    display_name: &'a str,
    parent: Option<Box<Context<'a>>>,
    parent_entry_pos: Position<'a>,
}

// --snip--

#[derive(Copy, Clone, Debug)]
pub enum BASICVal<'a> {
    Float(f64, Position<'a>, Position<'a>, &'a Context<'a>),
    Int(i64, Position<'a>, Position<'a>, &'a Context<'a>),
    Nothing(Position<'a>, Position<'a>, &'a Context<'a>),
}

// --snip--

pub fn run<'a>(text: &'a String, filename: &'a String) -> Result<(Context<'a>, BASICVal<'a>), BASICError<'a>> {
    // generate tokens
    let mut lexer = Lexer::new(text, filename);
    let tokens = lexer.make_tokens()?;

    // parse program to AST
    let mut parser = Parser::new(tokens);
    let ast = parser.parse();

    // run the program
    let context: Context<'static> = Context {
        display_name: "<program>",
        parent: None,
        parent_entry_pos: Position::default(),
    };
    Ok((context, interpreter_visit(&ast?, &context)?))
}

The errors are "cannot return value referencing local variable `context`" and, secondarily, "borrow of moved value: `context`":

error[E0515]: cannot return value referencing local variable `context`
   --> src\basic.rs:732:2
    |
732 |     Ok((context, interpreter_visit(&ast?, &context)?))
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^--------^^^^
    |     |                                     |
    |     |                                     `context` is borrowed here
    |     returns a value referencing data owned by the current function

error[E0382]: borrow of moved value: `context`
   --> src\basic.rs:732:40
    |
727 |     let context: Context<'static> = Context {
    |         ------- move occurs because `context` has type `basic::Context<'_>`, which does not implement the `Copy` trait
...
732 |     Ok((context, interpreter_visit(&ast?, &context)?))
    |         ------- value moved here          ^^^^^^^^ value borrowed here after move

As far as I understand it: the context references several lifetime-dependent structs. The values of these structs are static in this case, as I can see from explicitly setting the lifetime parameter to 'static without the compiler complaining. The interpreter_visit function needs to borrow the context because it gets passed to several independent functions, including interpreter_visit itself recursively. In addition, interpreter_visit returns BASICVals that themselves reference the context. For this reason, the context needs to outlive the return from run.

I try to achieve that by passing the context itself as part of the return value, thereby giving the caller control over its lifetime. But apparently I move the context into the return value before actually using it? This makes no sense to me. I should be able to reference one part of the return value from another part of the return value, because both values make it out of the function "alive".

I have tried:

  • boxing the context, thereby forcing it off the stack onto the heap, but that seems to only complicate things.
  • switching the order of the tuple, but that doesn't help.
  • storing interpreter_visit's result in an intermediate variable, which as expected doesn't help.
  • cloning the interpreter_visit result or the context itself

The issue may lie with the result and the error. The error doesn't reference a context, but giving it a separate lifetime in interpreter_visit breaks the entire careful balance I have been able to achieve so far.

  • I moved to a new street, but I brought my mailbox with me. Why did I stop getting mail? – trent Mar 04 '21 at 13:00
  • @trentcl please put that into more detail – kleines Filmröllchen Mar 04 '21 at 13:13
  • The question needs an [mre], but is probably a duplicate of [Why can't I store a value and a reference to that value in the same struct?](https://stackoverflow.com/q/32300132/3650362) (you're using a tuple instead of a struct, but the semantics are identical). Other questions linked there, like [1](https://stackoverflow.com/q/62096365/3650362) and [2](https://stackoverflow.com/q/53947975/3650362) might have other answers that help clear things up. – trent Mar 04 '21 at 13:46
  • tl;dr - lifetimes are associated with stack slots, and `context` is a stack slot in `run`, not its caller, so it can't have the caller-provided lifetime `'a`. There are cases (as the linked question's answers also mention) where Rust's stack-based lifetime analysis is overly conservative; your question doesn't contain enough detail to say for sure whether that's the case here, but if you're *really* sure that what you're doing is provably safe, it might be that `interpreter_visit` has an overly conservative signature. `&'a Context<'a>` is also highly suspicious. – trent Mar 04 '21 at 13:52
  • Also read [Why is it discouraged to accept a reference to a String (&String), Vec (&Vec), or Box (&Box) as a function argument?](https://stackoverflow.com/q/40006219/3650362) – trent Mar 04 '21 at 14:02
  • An MRE is difficult; I have tried, but they all ran into other unrelated errors that I wasn't getting in my code. The struct answer was helpful and flew under my radar for some reason. It actually seems like I misunderstood Rust's ability to reshuffle references (or it doesn't have one at all, whatever). The question remains why Box cannot solve this, as Box states that it heap-allocates, which makes the data's lifetime independent of stack frames, at least in every language I know of. – kleines Filmröllchen Mar 04 '21 at 14:24
  • Putting `context` in a `Box` just means the lifetime is associated with the stack slot the `Box` itself is in. All named lifetimes are associated with a value in a stack slot. The *actual value* may live longer than that, but then you're getting into runtime properties that can't be checked by static analysis, so you need to start dipping into safe abstractions that are checked at runtime (like `Arc`) or `unsafe` constructs with no checking at all (like `*const T`). – trent Mar 04 '21 at 14:36
  • This was finally solvable by giving interpreter_visit a raw pointer, but passing the context along so it gets destroyed nicely and doesn't die before the value. No memory leaks yet... thanks for all the help – kleines Filmröllchen Mar 04 '21 at 15:20

1 Answer

Answering this so that people don't have to read the comment thread.

This is a problem that apparently cannot be solved with Rust's borrow checker. The borrow checker does not take into account that a boxed context lives on the heap and therefore outlasts the function return, which would make it "legally" referenceable by the return value of interpreter_visit, which itself escapes the function. The solution in this case is to circumvent borrow checking via unsafe, namely with a raw pointer, like this:

let context = Box::new(Context {
    display_name: "<program>",
    parent: None,
    parent_entry_pos: Position::default(),
});
// Obtain a pointer to a location on the heap
let context_ptr: *const Context = &*context;
// outsmart the borrow checker
let result = interpreter_visit(&ast?, unsafe { &*context_ptr })?;
// The original box is passed back, so it is destroyed safely.
// Because the result lives as long as the context as required by the lifetime,
// we cannot get a pointer to uninitialized memory through the value and its context.
Ok((context, result))

I store a raw pointer to the context in context_ptr. The borrowed value passed to interpreter_visit is then piped through a raw pointer dereference and reborrow inside an unsafe block, which I believe is memory-safe as long as the Box outlives the result. Dereferencing a raw pointer produces a reference whose lifetime is not tied to any existing borrow (an unbounded lifetime), so the borrow checker accepts whatever lifetime interpreter_visit requires and no longer verifies that the context data actually lives that long. Because I still pass back the Box that owns the context data, the context keeps an owner and is not leaked. It is now possible to keep the interpreter_visit return value around after the context has been destroyed, which would leave a dangling reference, but because both values are printed and discarded immediately, I see no issues arising from this in the future.
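
For comparison, here is a minimal sketch of a variant without unsafe, in which the caller of run creates the context and lends it out, so that no returned value refers to a local variable of run. Context, Val and visit below are simplified, hypothetical stand-ins for the interpreter's real types and for interpreter_visit; the real signatures may need more than this.

```rust
// Simplified, hypothetical stand-in for the interpreter's Context.
#[derive(Debug)]
struct Context<'a> {
    display_name: &'a str,
}

// Simplified, hypothetical stand-in for BASICVal: each value borrows the context.
#[derive(Copy, Clone, Debug)]
enum Val<'c> {
    Int(i64, &'c Context<'c>),
    Nothing(&'c Context<'c>),
}

// Hypothetical stand-in for interpreter_visit: parses the text as an integer.
fn visit<'c>(text: &str, context: &'c Context<'c>) -> Val<'c> {
    match text.trim().parse::<i64>() {
        Ok(n) => Val::Int(n, context),
        Err(_) => Val::Nothing(context),
    }
}

// run borrows the context from its caller instead of creating it,
// so the returned value never refers to data owned by run itself.
fn run<'c>(text: &str, context: &'c Context<'c>) -> Val<'c> {
    visit(text, context)
}

fn main() {
    // The caller owns the context, so it outlives every value run returns.
    let context = Context { display_name: "<program>" };
    let value = run("42", &context);
    if let Val::Int(n, ctx) = value {
        println!("{} evaluated to {}", ctx.display_name, n);
    }
}
```

The trade-off in this sketch is that the caller has to construct the context before calling run, rather than receiving it back as part of the result.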

If you have a deeper understanding of Rust's borrow checker and consider this a fixable edge case, one without a "safe" solution that I simply failed to come up with, please do comment and I will report it to the Rust team. However, I'm not that certain, especially because my experience with and knowledge of Rust are limited.
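
One candidate for such a "safe" solution, following the comment about runtime-checked abstractions like Arc, is shared ownership through reference counting. The following is only a sketch under assumptions: Context, Val and visit are simplified, hypothetical stand-ins for the real types and for interpreter_visit, and Rc is used instead of Arc on the assumption that the interpreter is single-threaded.

```rust
use std::rc::Rc;

// Simplified, hypothetical stand-in for the interpreter's Context.
#[derive(Debug)]
struct Context {
    display_name: String,
}

// Simplified, hypothetical stand-in for BASICVal: each value holds a
// reference-counted handle to the context instead of a borrowed reference.
#[derive(Clone, Debug)]
enum Val {
    Int(i64, Rc<Context>),
    Nothing(Rc<Context>),
}

// Hypothetical stand-in for interpreter_visit: parses the text as an integer.
fn visit(text: &str, context: &Rc<Context>) -> Val {
    match text.trim().parse::<i64>() {
        Ok(n) => Val::Int(n, Rc::clone(context)),
        Err(_) => Val::Nothing(Rc::clone(context)),
    }
}

// run can create the context locally and still return it together with a
// value that refers to it, because both hold owning handles to the same data.
fn run(text: &str) -> (Rc<Context>, Val) {
    let context = Rc::new(Context {
        display_name: "<program>".to_string(),
    });
    let value = visit(text, &context);
    (context, value)
}

fn main() {
    let (context, value) = run("42");
    if let Val::Int(n, ctx) = &value {
        println!("{} evaluated to {}", ctx.display_name, n);
    }
    // One handle was returned directly, one lives inside the value.
    println!("strong count: {}", Rc::strong_count(&context));
}
```

The cost in this sketch is that the value type can no longer be Copy, and every value carries a reference-count increment; in exchange, the context is freed only after the last value referring to it is gone.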