
I'm writing a library that should read from anything implementing the BufRead trait: a network data stream, standard input, etc. The first function is supposed to read a data unit from that reader and return a struct populated mostly with &'a str values parsed from a frame off the wire.

Here is a minimal version:

mod mymod {
    use std::io::prelude::*;
    use std::io;

    pub fn parse_frame<'a, T>(mut reader: T)
    where
        T: BufRead,
    {
        for line in reader.by_ref().lines() {
            let line = line.expect("reading header line");
            if line.len() == 0 {
                // got empty line; done with header
                break;
            }
            // split line
            let splitted = line.splitn(2, ':');
            let line_parts: Vec<&'a str> = splitted.collect();

            println!("{} has value {}", line_parts[0], line_parts[1]);
        }
        // more reads down here, therefore the reader.by_ref() above
        // (otherwise: use of moved value).
    }
}

use std::io;

fn main() {
    let stdin = io::stdin();
    let locked = stdin.lock();
    mymod::parse_frame(locked);
}

An error shows up which I cannot fix after trying different solutions:

error: `line` does not live long enough
  --> src/main.rs:16:28
   |
16 |             let splitted = line.splitn(2, ':');
   |                            ^^^^ does not live long enough
...
20 |         }
   |         - borrowed value only lives until here
   |
note: borrowed value must be valid for the lifetime 'a as defined on the body at 8:4...
  --> src/main.rs:8:5
   |
8  | /     {
9  | |         for line in reader.by_ref().lines() {
10 | |             let line = line.expect("reading header line");
11 | |             if line.len() == 0 {
...  |
22 | |         // (otherwise: use of moved value).
23 | |     }
   | |_____^

The lifetime 'a is defined on a struct and implementation of a data keeper structure because the &str requires an explicit lifetime. These code parts were removed as part of the minimal example.

BufRead has a lines() method whose iterator yields Result<String, io::Error> items. I handle errors using expect or match, unpacking the Result so that the program has the bare String. This is then done multiple times to populate a data structure.

Many answers say that the unwrapped result needs to be bound to a variable, otherwise it gets lost because it is a temporary value. But I already saved the unpacked Result value in the variable line and I still get the error.

  1. How can I fix this error? I could not get it working after hours of trying.

  2. Does it make sense to do all these lifetime declarations just for &str in a data keeper struct? This will be mostly a read-only data structure, at most replacing whole field values. String could also be used, but I have found articles saying that String has lower performance than &str, and this frame parser function will be called many times and is performance-critical.

Similar questions exist on Stack Overflow, but none quite answers the situation here.

For completeness and better understanding, here is an excerpt from the complete source code showing why the lifetime question came up:

Data structure declaration:

// tuple
pub struct Header<'a>(pub &'a str, pub &'a str);

pub struct Frame<'a> {
    pub frameType: String,
    pub bodyType: &'a str,
    pub port: &'a str,
    pub headers: Vec<Header<'a>>,
    pub body: Vec<u8>,
}

impl<'a> Frame<'a> {
    pub fn marshal(&'a self) {
        //TODO
        println!("marshal!");
    }
}

Complete function definition:

pub fn parse_frame<'a, T>(mut reader: T) -> Result<Frame<'a>, io::Error> where T: BufRead {
Shepmaster
ERnsTL
  • The `'a` lifetime doesn't match the lifetime of `line`. https://play.rust-lang.org/?gist=3f510b31e0009af23d637c04660d4586&version=stable&backtrace=0. – Stargateur Jun 20 '17 at 00:40
  • Oh my... so simple change. I was assuming that all `&str` and `Vec<&str>` need to be of same lifetime and thus made it `Vec<&'a str>`. Did not expect *this* to be the cause of trouble. You do obviously have a trained eye - thank you so much, @Stargateur! – ERnsTL Jun 20 '17 at 00:57
  • Not only is `'a` left unrestricted, there's nothing owning the test data fetched from the reader. You might want to use `String` rather than non-owning strings. – E_net4 Jun 20 '17 at 00:59
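
The change Stargateur's comment points at can be sketched as follows. This is a minimal sketch, assuming the only edit needed in the loop is dropping the explicit `'a` from the `Vec` annotation so that the compiler infers a lifetime local to each iteration. The name `collect_header_pairs` and the decision to return owned `(String, String)` pairs instead of printing are illustrative assumptions, not part of the original code.

```rust
use std::io;
use std::io::prelude::*;

/// Reads header lines up to the first empty line and returns (name, value) pairs.
/// Hypothetical helper: the function in the question printed the parts instead.
pub fn collect_header_pairs<T>(mut reader: T) -> io::Result<Vec<(String, String)>>
where
    T: BufRead,
{
    let mut pairs = Vec::new();
    for line in reader.by_ref().lines() {
        let line = line?;
        if line.is_empty() {
            // Got an empty line; done with the header.
            break;
        }
        // No `'a` here: these slices borrow from `line`, which only lives for
        // this loop iteration, so the compiler infers a suitably short lifetime.
        let line_parts: Vec<&str> = line.splitn(2, ':').collect();
        if line_parts.len() == 2 {
            // Copy the parts into owned Strings so that they outlive `line`.
            pairs.push((line_parts[0].to_owned(), line_parts[1].to_owned()));
        }
    }
    Ok(pairs)
}
```

As E_net4's comment notes, anything that must survive the loop needs an owner, which is why this sketch copies the two parts before `line` is dropped.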

2 Answers

Your problem can be reduced to this:

fn foo<'a>() {
    let thing = String::from("a b");
    let parts: Vec<&'a str> = thing.split(" ").collect();
}

You create a String inside your function, then declare that references to that string are guaranteed to live for the lifetime 'a. Unfortunately, the lifetime 'a isn't under your control — the caller of the function gets to pick what the lifetime is. That's how generic parameters work!

What would happen if the caller of the function specified the 'static lifetime? How would it be possible for your code, which allocates a value at runtime, to guarantee that the value lives longer than even the main function? It's not possible, which is why the compiler has reported an error.

Once you've gained a bit more experience, the function signature fn foo<'a>() will jump out at you like a red alert — there's a generic parameter that isn't used. That's most likely going to mean bad news.
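
For contrast, here is a minimal sketch of the reduced example in a form that does compile, assuming the string is created by the caller and passed in. The name `split_words` is made up for illustration. The lifetime `'a` now appears on an input, so whatever lifetime the caller picks is backed by data the caller actually owns.

```rust
/// `'a` is tied to the `thing` argument, so the returned slices
/// borrow from data that the caller keeps alive.
fn split_words<'a>(thing: &'a str) -> Vec<&'a str> {
    thing.split(' ').collect()
}
```

A caller would create the `String` first, for example `let owner = String::from("a b");`, and then call `split_words(&owner)`; the resulting slices stay valid for as long as `owner` does.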


> return a populated struct filled mostly with &'a str

You cannot possibly do this with the current organization of your code. References have to point to something. You are not providing anywhere for the pointed-at values to live. You cannot return an allocated String as a string slice.

Before you jump to it: no, you cannot store a value and a reference to that value in the same struct.

Instead, you need to separate the code that creates the String from the code that parses a &str and returns more &str references. That's how existing zero-copy parsers work; you could look at those for inspiration.
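
One way this split could look is sketched below. It is only a sketch under assumptions: the function names `read_header_block` and `parse_headers` are invented here, the header format is taken to be `name:value` lines ended by an empty line as in the question, and `Header` mirrors the tuple struct from the question with derives added for comparison.

```rust
use std::io::{self, BufRead};

/// Borrowed header pair, mirroring the tuple struct from the question.
#[derive(Debug, PartialEq)]
pub struct Header<'a>(pub &'a str, pub &'a str);

/// Step 1 (owning): read the raw header lines into one String that the caller keeps.
/// `read_header_block` is a hypothetical name used for this sketch.
pub fn read_header_block<R: BufRead>(reader: &mut R) -> io::Result<String> {
    let mut block = String::new();
    loop {
        let start = block.len();
        // read_line appends to `block` and returns the number of bytes read.
        let n = reader.read_line(&mut block)?;
        let is_blank = block[start..]
            .trim_end_matches(|c: char| c == '\r' || c == '\n')
            .is_empty();
        // Stop at end of input or at the empty line that ends the header.
        if n == 0 || is_blank {
            block.truncate(start);
            break;
        }
    }
    Ok(block)
}

/// Step 2 (borrowing): parse `name:value` lines out of text owned elsewhere.
/// The returned headers borrow from `block`, so `'a` is tied to an input.
pub fn parse_headers<'a>(block: &'a str) -> Vec<Header<'a>> {
    block
        .lines()
        .filter_map(|line| {
            let mut parts = line.splitn(2, ':');
            match (parts.next(), parts.next()) {
                (Some(name), Some(value)) => Some(Header(name, value.trim())),
                _ => None, // skip lines without a ':' separator
            }
        })
        .collect()
}
```

The caller holds the `String` returned by `read_header_block` and passes `&block` to `parse_headers`; the headers are then valid for as long as `block` is alive, and the reader can still be used for the body afterwards.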

> String has lower performance than &str

No, it really doesn't. Creating lots of extraneous Strings is a bad idea, sure, just like allocating too much is a bad idea in any language.
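
If owned fields are acceptable, a sketch along the following lines avoids lifetime parameters entirely. The names `OwnedHeader` and `parse_headers_owned` are hypothetical, and the header format is assumed to be the same `name:value` lines ended by an empty line. To keep extraneous allocations down, it reuses a single line buffer and only allocates the strings that are kept; whether that is fast enough for a given workload is something to measure rather than assume.

```rust
use std::io::{self, BufRead};

/// Owned variant of the question's `Header`: no lifetime parameter needed.
#[derive(Debug, PartialEq)]
pub struct OwnedHeader(pub String, pub String);

/// Reads headers up to the first empty line, reusing one line buffer.
/// `parse_headers_owned` is a hypothetical name used for this sketch.
pub fn parse_headers_owned<R: BufRead>(reader: &mut R) -> io::Result<Vec<OwnedHeader>> {
    let mut headers = Vec::new();
    let mut line = String::new(); // single buffer, cleared before each read
    loop {
        line.clear();
        if reader.read_line(&mut line)? == 0 {
            break; // end of input
        }
        // Drop trailing whitespace, including the line ending.
        let trimmed = line.trim_end();
        if trimmed.is_empty() {
            break; // empty line: end of header
        }
        let mut parts = trimmed.splitn(2, ':');
        if let (Some(name), Some(value)) = (parts.next(), parts.next()) {
            // Only the kept fields are allocated; the line buffer is reused.
            headers.push(OwnedHeader(name.to_owned(), value.trim().to_owned()));
        }
    }
    Ok(headers)
}
```

Because the function takes `&mut R`, the same reader can be used afterwards to read the frame body.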

Shepmaster
  • Greetings @Shepmaster, thanks for explaining! Will take a look at parsers written in Rust. I am not fully sure though about what is meant when you write *references have to point to something* or *have to be owned by somebody*: To my understanding, a `&str` points to a character string = region in memory and it can then be attached to a data structure (a `struct`) and that can be returned to the caller, who owns/receives the returned result data structure. That is at least how my mental model is, coming from other PLs. Rust seems to require a different way of thinking. Greetings – ERnsTL Jun 22 '17 at 19:20
  • @ERnsTL there's no drastic difference between Rust and other languages in what happens. The difference is that Rust ensures your references are always valid (unlike C or C++) and does so without using a garbage collector (unlike Java, C#, Ruby, etc.). In a language like Java, every `Object` is actually a type of reference and the language / VM ensures that the thing that is referred to will be around as long as needed. Your code is trying to have a reference without the referred-to-thing; this is akin to having an address without a house present; visiting the address will lead to Bad Things. – Shepmaster Jun 22 '17 at 20:01
  • Thanks again for explaining - it's clearly not enough to understand a bit of Rust syntax; knowing the ownership system is a must, it has consequences all over. The *Rust book* in chapter *Understanding Ownership* (references, borrowing, slices) and *Common Collections* (Strings and slices) explain why the code above does not work. In chapter 5 (2. Ed. draft) *Ownership of struct data* has exactly an example like above with the solution in chapter 10: *Generic Types, Traits and Lifetimes*. Studying zero-copy parsers like Nom and tutorials (eg. *Rusty Buffers*) also helps greatly. – ERnsTL Jun 24 '17 at 20:05

Maybe the following program gives clues to others who are also having their first problems with lifetimes:

fn main() {
    // using String and a &str slice
    let my_str: String = "fire".to_owned();
    let returned_str: MyStruct = my_func_str(&my_str);
    println!("Received return value: {ret}", ret = returned_str.version);

    // using Vec<u8> and a &[u8] slice
    let my_vec: Vec<u8> = "fire".to_owned().into_bytes();
    let returned_u8: MyStruct2 = my_func_vec(&my_vec);
    println!("Received return value: {ret:?}", ret = returned_u8.version);
}


// using String -> str
fn my_func_str<'a>(some_str: &'a str) -> MyStruct<'a> {
    MyStruct {
        version: &some_str[0..2],
    }
}

struct MyStruct<'a> {
    version: &'a str,
}


// using Vec<u8> -> &[u8]
fn my_func_vec<'a>(some_vec: &'a Vec<u8>) -> MyStruct2<'a> {
    MyStruct2 {
        version: &some_vec[0..2],
    }
}

struct MyStruct2<'a> {
    version: &'a [u8],
}
Shepmaster
ERnsTL