12

Suppose I have several structures like in the following example, and in the next() method I need to pull the next event using a user-provided buffer, but if this event is a comment, and ignore comments flag is set to true, I need to pull the next event again:

struct Parser {
    ignore_comments: bool,
}

enum XmlEvent<'buf> {
    Comment(&'buf str),
    Other(&'buf str),
}

impl Parser {
    fn next<'buf>(&mut self, buffer: &'buf mut String) -> XmlEvent<'buf> {
        let result = loop {
            buffer.clear();

            let temp_event = self.parse_outside_tag(buffer);

            match temp_event {
                XmlEvent::Comment(_) if self.ignore_comments => {}
                _ => break temp_event,
            }
        };
        result
    }

    fn parse_outside_tag<'buf>(&mut self, _buffer: &'buf mut String) -> XmlEvent<'buf> {
        unimplemented!()
    }
}

This code, however, gives a double borrow error, even when I have #![feature(nll)] enabled:

error[E0499]: cannot borrow `*buffer` as mutable more than once at a time
  --> src/main.rs:14:13
   |
14 |             buffer.clear();
   |             ^^^^^^ second mutable borrow occurs here
15 |             
16 |             let temp_event = self.parse_outside_tag(buffer);
   |                                                     ------ first mutable borrow occurs here
   |
note: borrowed value must be valid for the lifetime 'buf as defined on the method body at 12:5...
  --> src/main.rs:12:5
   |
12 |     fn next<'buf>(&mut self, buffer: &'buf mut String) -> XmlEvent<'buf> {
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error[E0499]: cannot borrow `*buffer` as mutable more than once at a time
  --> src/main.rs:16:53
   |
16 |             let temp_event = self.parse_outside_tag(buffer);
   |                                                     ^^^^^^ mutable borrow starts here in previous iteration of loop
   |
note: borrowed value must be valid for the lifetime 'buf as defined on the method body at 12:5...
  --> src/main.rs:12:5
   |
12 |     fn next<'buf>(&mut self, buffer: &'buf mut String) -> XmlEvent<'buf> {
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error: aborting due to 2 previous errors

I can (approximately at least) understand why an error could happen here with the NLL feature turned off, but I don't understand why it happens with NLL.

Anyway, my end goal is to implement this without flags, so I also tried doing this (it is recursive, which is really unfortunate, but all non-recursive versions I came up with could not possibly work without NLL):

fn next<'buf>(&mut self, buffer: &'buf mut String) -> XmlEvent<'buf> {
    buffer.clear();

    {
        let temp_event = self.parse_outside_tag(buffer);

        match temp_event {
            XmlEvent::Comment(_) if self.ignore_comments => {}
            _ => return temp_event,
        }
    }

    self.next(buffer)
}

Here I tried to confine the borrow inside a lexical block, and nothing from this block leaks to the outside. However, I'm still getting an error:

error[E0499]: cannot borrow `*buffer` as mutable more than once at a time
  --> src/main.rs:23:19
   |
15 |             let temp_event = self.parse_outside_tag(buffer);
   |                                                     ------ first mutable borrow occurs here
...
23 |         self.next(buffer)
   |                   ^^^^^^ second mutable borrow occurs here
24 |     }
   |     - first borrow ends here

error: aborting due to previous error

And again, NLL does not fix it.

It has been a long time since I encountered a borrow checking error which I don't understand, so I'm hoping it is actually something simple which I'm overlooking for some reason :)

I really suspect that the root cause is somehow connected with the explicit 'buf lifetime (in particular, errors with the NLL flag turned on have these notes about it), but I can't understand what exactly is wrong here.

Vladimir Matveev
  • 120,085
  • 34
  • 287
  • 296
  • Which `rustc` version do you use? Does `temp_event` is/contain a reference? I'm asking for the `rustc` version, since [match_default_bindings](https://github.com/rust-lang/rust/pull/49394) went stable recently and this might end in a reference here, while you didn't expected it to be a reference. – Tim Diekmann May 24 '18 at 23:38
  • @Tim rustc version is 2018-05-20 nightly or the latest stable. And yes, temp_event does contain a reference to the buffer. I thought that it should be visible from the method signature but apparently it was not :( I'm not sure how that particular feature is related here, because the match does not capture anything, it just verifies the structure of the data. – Vladimir Matveev May 25 '18 at 07:13
  • You have a design flow here, as `XmlEvent` take the ownership of `buffer` when you choose to return `temp_event` you bound it's lifetime to the lifetime of the function. I think there is no way to solve this issue. You could do the check without construct `XmlEvent`. – Stargateur May 25 '18 at 09:35

2 Answers2

10

This is a limitation of the current implementation of non-lexical lifetimes This can be shown with this reduced case:

fn next<'buf>(buffer: &'buf mut String) -> &'buf str {
    loop {
        let event = parse(buffer);

        if true {
            return event;
        }
    }
}

fn parse<'buf>(_buffer: &'buf mut String) -> &'buf str {
    unimplemented!()
}

fn main() {}

This limitation prevents NLL case #3: conditional control flow across functions

In compiler developer terms, the current implementation of non-lexical lifetimes is "location insensitive". Location sensitivity was originally available but it was disabled in the name of performance.

I asked Niko Matsakis about this code:

In the context of your example: the value event only has to have the lifetime 'buf conditionally — at the return point which may or may not execute. But when we are "location insensitive", we just track the lifetime that event must have anywhere, without considering where that lifetime must hold. In this case, that means we make it hold everywhere, which is why you get a compilation failure.

One subtle thing is that the current analysis is location sensitive in one respect — where the borrow takes place. The length of the borrow is not.

The good news is that adding this concept of location sensitivity back is seen as an enhancement to the implementation of non-lexical lifetimes. The bad news:

That may or may not be before the [Rust 2018] edition.

(Note: it did not make it into the initial release of Rust 2018)

This hinges on a (even newer!) underlying implementation of non-lexical lifetimes that improves the performance. You can opt-in to this half-implemented version using -Z polonius:

rustc +nightly -Zpolonius --edition=2018 example.rs
RUSTFLAGS="-Zpolonius" cargo +nightly build

Because this is across functions, you can sometimes work around this by inlining the function.

Shepmaster
  • 388,571
  • 95
  • 1,107
  • 1,366
  • Thanks for checking this with Niko! I guess for now I'll have to stick with `unsafe` to overcome this thing. I think I know how to overcome it without resorting to `unsafe`, but it would require changing internal method types in way I don't like :( – Vladimir Matveev May 29 '18 at 12:14
4

I posted a question (A borrow checker problem with a loop and non-lexical lifetimes) that was answered by the answer of this question.

I'll document here a workaround that also answers the question. Let's say you have code like this, that only compiles with Polonius:

struct Inner;

enum State<'a> {
    One,
    Two(&'a ()),
}

fn get<'s>(_inner: &'s mut Inner) -> State<'s> {
    unimplemented!()
}

struct Outer {
    inner: Inner,
}

impl Outer {
    pub fn read<'s>(&'s mut self) -> &'s () {
        loop {
            match get(&mut self.inner) {
                State::One => (), // In this case nothing happens, the borrow should end and the loop should continue
                State::Two(a) => return a, // self.inner ought to be borrowed for 's, that's just to be expected
            }
        }
    }
}

As it was said in the another answer:

One subtle thing is that the current analysis is location sensitive in one respect — where the borrow takes place. The length of the borrow is not.

Indeed, borrowing the needed reference again inside the conditional branch makes it compile! Of course, this makes the assumption that get is referentially transparent, so your mileage may vary, but borrowing again seems like an easy enough workaround.

struct Inner;

enum State<'a> {
    One,
    Two(&'a ()),
}

fn get<'s>(_inner: &'s mut Inner) -> State<'s> {
    unimplemented!()
}

struct Outer {
    inner: Inner,
}

impl Outer {
    pub fn read<'s>(&'s mut self) -> &'s () {
        loop {
            match get(&mut self.inner) {
                State::One => (), // In this case nothing happens, the borrow should end and the loop should continue
                State::Two(a) => {
                    return match get(&mut self.inner) { // Borrowing again compiles!
                        State::Two(a) => a,
                        _ => unreachable!(),
                    }
                }, // self.inner ought to be borrowed for 's, that's just to be expected
            }
        }
    }
}

fn main() {
    println!("Hello, world!");
}
GolDDranks
  • 3,272
  • 4
  • 22
  • 30