3

I've very recently started studying Rust, and while working on a test program, I wrote this method:

pub fn add_transition(&mut self, start_state: u32, end_state: u32) -> Result<bool, std::io::Error> {
    let mut m: Vec<Page>;
    let pages: &mut Vec<Page> = match self.page_cache.get_mut(&start_state) {
        Some(p) => p,
        None    => {
            m = self.index.get_pages(start_state, &self.file)?;
            &mut m
        }
    };

    // omitted code that mutates pages 
    // ...

    Ok(true)
}

it does work as expected, but I'm not convinced about the m variable. If I remove it, the code looks more elegant:

pub fn add_transition(&mut self, start_state: u32, end_state: u32) -> Result<bool, std::io::Error> {
    let pages: &mut Vec<Page> = match self.page_cache.get_mut(&start_state) {
        Some(p) => p,
        None    => &mut self.index.get_pages(start_state, &self.file)?
    };

    // omitted code that mutates pages
    // ...

    Ok(true)
}

but I get:

error[E0716]: temporary value dropped while borrowed
  --> src\module1\mod.rs:28:29
   |
26 |           let pages: &mut Vec<Page> = match self.page_cache.get_mut(&start_state) {
   |  _____________________________________-
27 | |             Some(p) => p,
28 | |             None    => &mut self.index.get_pages(start_state, &self.file)?
   | |                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^-
   | |                             |                                            |
   | |                             |                                            temporary value is freed at the end of this statement
   | |                             creates a temporary which is freed while still in use
29 | |         };
   | |_________- borrow later used here
   |
   = note: consider using a `let` binding to create a longer lived value

I fully understand the error, which directed me to the working snippet, but I'm wondering if there's a more elegant and/or idiomatic way of writing this code. I am declaring m at the beginning of the function, only to prevent a temporary variable from being freed too early. Is there a way of telling the compiler that the lifetime of the return value of self.index.get_pages should be the whole add_transition function?

Further details:

  • Page is a relatively big struct, so I'd rather not implement the Copy trait nor I'd clone it.
  • page_cache is of type HashMap<u32, Vec<Page>>
  • self.index.get_pages is relatively slow and I'm using page_cache to cache results
  • The return type of self.index.get_pages is Result<Vec<Page>, std::io::Error>
didymus
  • 250
  • 2
  • 9
  • 1
    Looks fine to me. Declared variables and scopes are the normal way of defining lifetimes, when you need to do that. – trent Jan 21 '20 at 00:42

2 Answers2

4

This is normal, your 'cleaner' code basically comes down to do something as follows:

let y = {
    let x = 42;
    &x
};

Here it should be obvious that you cannot return a reference to x because x is dropped at the end of the block. Those rules don't change when working with temporary values: self.index.get_pages(start_state, &self.file)? creates a temporary value that is dropped at the end of the block (line 29) and thus you can't return a reference to it.

The workaround via m now moves that temporary into the m binding one block up which will live long enough for pages to work with it.

Now for alternatives, I guess page_cache is a HashMap? Then you could alternatively do something like let pages = self.page_cache.entry(start_state).or_insert_with(||self.index.get_pages(...))?;. The only problem with that approach is that get_pages returns a Result while the current cache stores Vec<Page> (the Ok branch only). You could adapt the cache to actually store Result instead, which I think is semantically also better since you want to cache the results of that function call, so why not do that for Err? But if you have a good reason to not cache Err, the approach you have should work just fine.

KillianDS
  • 16,936
  • 4
  • 61
  • 70
  • I understand that, I was wondering if there was a way of telling the compiler that the lifetime of the return value of ‘self.index.get_pages’ must be the whole add_transition function, without having to declare a variable that is never used. – didymus Jan 21 '20 at 09:18
  • @didymus Not as far as I know, you cannot force a temporary to live longer than its scope, except by moving ownership to a longer-lived context(binding), as you did. Even if you could hack something with explicit lifetimes, it's not the best or more idiomatic approach, there is a reason the compiler gave the binding hint. – KillianDS Jan 21 '20 at 13:23
0

Yours is probably the most efficient way, but in theory not necessary, and one can be more elegant.

Another way of doing it is to use a trait object in this case — have the variable be of the type dyn DerefMut<Vec<Page>>. This basically means that this variable can hold any type that implements the trait DerefMut<Vec<Page>>>, two types that do so are &mut Vec<Page> and Vec<Page>, in that case the variable can hold either of these, but the contents can only be referenced via DerefMut.

So the following code works as an illustration:

struct Foo {
    inner : Option<Vec<i32>>,
}

impl Foo {
    fn new () -> Self {
        Foo { inner : None }
    }
    fn init (&mut self) {
        self.inner = Some(Vec::new())
    }
    fn get_mut_ref (&mut self) -> Option<&mut Vec<i32>> {
        self.inner.as_mut()
    }
}

fn main () {
    let mut foo : Foo = Foo::new();
    let mut m   : Box<dyn AsMut<Vec<i32>>> = match foo.get_mut_ref() {
        Some(r) => Box::new(r),
        None    => Box::new(vec![1,2,3]),
    };
        m.as_mut().as_mut().push(4);
}

The key here is the type Box<dyn AsMut<Vec<i32>>; this means that it can be a box that holds any type, so long the type implement AsMut<Vec<i32>>, because it's boxed in we also need .as_mut().as_mut() to get the actual &mut <Vec<i32>> out of it.

Because different types can have different sizes; they also cannot be allocated on the stack, so they must be behind some pointer, a Box is typically chosen therefore, and in this case necessary, a normal pointer that is sans ownership of it's pointee will face similar problems to those you face.

One might argue that this code is more elegant, but yours is certainly more efficient and does not require further heap allocation.

Zorf
  • 6,334
  • 2
  • 29
  • 24