I'm trying to implement a language lexer. I started by parsing a string and returning a vector of strings, but I keep running into the borrow checker. After trying many of the things suggested here on Stack Overflow, I'm really lost.

```rust
mod parser {
    pub struct Lexer {}

    impl Lexer {
        pub fn parse<'a>(&self, source: &'a str) -> Vec<&'a str> {
            let mut token = String::new();
            let mut tokens: Vec<&'a str> = vec![];

            for character in source.chars() {
                // ' ' is just a test example, a full lexer
                // will have more complex logic to split the string
                if character == ' ' {
                    tokens.push(&token); // what do I do here to create a &str with the correct lifetime?
                    token = String::new();
                    continue;
                }

                token.push(character);
            }

            tokens
        }
    }
}

fn main() {
    let lexer = parser::Lexer {};
    let tokens = lexer.parse("some dummy text");
}
```

```
error[E0597]: `token` does not live long enough
  --> src/main.rs:13:34
   |
13 |                     tokens.push(&token); // what do I do here to create a &str with the correct lifetime?
   |                                  ^^^^^ borrowed value does not live long enough
...
22 |         }
   |         - borrowed value only lives until here
   |
note: borrowed value must be valid for the lifetime 'a as defined on the method body at 5:22...
  --> src/main.rs:5:22
   |
5  |         pub fn parse<'a>(&self, source: &'a str) -> Vec<&'a str> {
   |                      ^^
```

I guess there's some magical method that will clone my string with the correct lifetime before I push it to my tokens vector, but at this point I'm not sure about anything I've done.

I'm using Rust 1.28.0.

Edit:

It's a bit different from the answer marked as duplicate because I'm returning a vector of strings, not simply a &str.

If I change my method to the following:

```rust
pub fn parse(&self, source: &str) -> Vec<&str> {
    vec!["some", "dummy", "text"]
}
```

The compiler doesn't complain about lifetimes, and this behavior is not explained in the other answers. In this case, who owns the strings? Can I not replicate this ownership with dynamically created strings?

Shepmaster
Marco Luglio
  • Nope, you cannot return a reference to the `String` created inside the function. Since nothing owns it, it will be deallocated when the function exits and the reference would be invalid. Rust prevents that. You will have to return the `String` instead. You may also be interested in `Cow`. – Shepmaster Sep 24 '18 at 16:39
  • Of course, instead of creating new `String`s, you could take references to the passed-in `&str`, but presumably you are creating those new `String`s for a reason. – Shepmaster Sep 24 '18 at 16:41
  • It's not different because you return a `Vec`. If you return string **literals**, it is different. That would then be a duplicate of [How does the lifetime work on constant strings / string literals?](https://stackoverflow.com/q/31230585/155423). However, your original question has nothing to do with literals, so that doesn't seem like a useful comparison to make. – Shepmaster Sep 24 '18 at 17:34
  • *replicate this ownership with dynamically created strings* — that's what a `String` **is**: ownership of a dynamically created string. – Shepmaster Sep 24 '18 at 17:38
  • So in the case of returning `vec!["something"]` I'm returning a vector of string literals, and in this case the "something" string has a static lifetime? – Marco Luglio Sep 24 '18 at 17:45
  • That is correct. `"foo"` is a `&'static str` and the `'static` lifetime can be mapped to any possible lifetime, thus it's memory safe to return from your example. – Shepmaster Sep 24 '18 at 17:47
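
The options raised in the comments above can be sketched roughly as follows. This is only a minimal sketch, assuming the same "split on `' '`" placeholder logic as the question; the function names `parse_owned`, `parse_borrowed` and `literals` are made up for illustration and are not part of the original code. One deliberate difference from the question's loop: these versions also keep the final token after the last space.

```rust
use std::mem;

/// Option 1: return owned `String`s. Each token is moved into the
/// vector, so nothing borrows from a local that dies with the function.
fn parse_owned(source: &str) -> Vec<String> {
    let mut token = String::new();
    let mut tokens = Vec::new();

    for character in source.chars() {
        if character == ' ' {
            // Move the finished token out and leave a fresh, empty
            // `String` behind in `token`.
            tokens.push(mem::replace(&mut token, String::new()));
            continue;
        }
        token.push(character);
    }
    // Keep whatever was accumulated after the last space.
    tokens.push(token);
    tokens
}

/// Option 2: return slices of the input. The slices borrow from
/// `source`, so they can legitimately carry the lifetime `'a`.
fn parse_borrowed<'a>(source: &'a str) -> Vec<&'a str> {
    let mut tokens = Vec::new();
    let mut start = 0;

    for (index, character) in source.char_indices() {
        if character == ' ' {
            tokens.push(&source[start..index]);
            // Skip past the separator (byte length, not char count).
            start = index + character.len_utf8();
        }
    }
    tokens.push(&source[start..]);
    tokens
}

/// The literal case from the edit: each literal is a `&'static str`,
/// which can be shortened to any `'a`, so this compiles even though
/// nothing here borrows from `_source`.
fn literals<'a>(_source: &'a str) -> Vec<&'a str> {
    vec!["some", "dummy", "text"]
}

fn main() {
    println!("{:?}", parse_owned("some dummy text"));
    println!("{:?}", parse_borrowed("some dummy text"));
    println!("{:?}", literals("ignored"));
}
```

Which variant fits depends on whether the lexer needs to transform the text (unescaping, normalising) or only to cut it up: slices avoid allocation but can only point at bytes that already exist in `source`, while `String`s can hold anything at the cost of one allocation per token.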
