3

I'd like to use a HashMap to cache an expensive computation that's dependent on other entries in the map. The entry pattern only provides a mutable reference to the matched value, but not to the rest of the HashMap. I'd really appreciate feedback on a better way to solve this (incorrect) toy example:

use std::collections::HashMap;
use std::collections::hash_map::Entry::{Occupied, Vacant};

fn compute(cache: &mut HashMap<u32, u32>, input: u32) -> u32 {
    match cache.entry(input) {
        Vacant(entry) => if input > 2 {
            // Trivial placeholder for an expensive computation.
            *entry.insert(compute(&mut cache, input - 1) +
                          compute(&mut cache, input - 2))
        } else {
            0
        },
        Occupied(entry) => *entry.get(),
    }
}

fn main() {
    let mut cache = HashMap::<u32, u32>::new();
    let foo = compute(&mut cache, 12);
    println!("{}", foo);
}

(playground)

The issue with the above snippet is that cache.entry borrows cache immutably, but I would like to update cache as well.

Shepmaster
  • 388,571
  • 95
  • 1,107
  • 1,366
Kunal
  • 85
  • 7
  • 1
    How `entry` could be still valid if you change the hashmap ? – Stargateur Oct 17 '18 at 04:27
  • Stargateur -- since [entry](https://doc.rust-lang.org/std/collections/hash_map/enum.Entry.html) is just a view into a single item in the hash map, it would seem reasonable that it's more or less unrelated to other elements. But that's an interesting point, because mutating the hash map could cause collision handling logic to affect `entry` -- I haven't looked into how it's implemented. – Kunal Oct 17 '18 at 14:17
  • 1
    I don't know any implement of hashmap that will make possible what you are trying. – Stargateur Oct 17 '18 at 14:29
  • You're right -- there's no way for the borrow checker to deduce that they're mutually exclusive key mutations. Cheers! – Kunal Oct 18 '18 at 23:29

2 Answers2

8

hellow has shown how to get working code, but I want to dive a bit more into why your code does not compile.

The code you have proposed cannot be statically verified to be memory safe. It's entirely possible that your recursive calls attempt to access the same index. Check out this simplified code for one possibility:

use std::collections::{hash_map::Entry, HashMap};

fn compute(cache: &mut HashMap<u32, u32>) {
    if let Entry::Vacant(_entry) = cache.entry(42) {
        let _aliased_mutable_reference = cache.get_mut(&42).unwrap();
    }
}

This now has two mutable references pointing to the same value, violating the rules of references.

Additionally, what if the inner call used entry and it didn't exist?

use std::collections::{hash_map::Entry, HashMap};

fn compute(cache: &mut HashMap<u32, u32>) {
    if let Entry::Vacant(entry1) = cache.entry(42) {
        if let Entry::Vacant(entry2) = cache.entry(41) {
            entry2.insert(2);
            entry1.insert(1);
        }
    }
}

Now, when you insert the value into the map via entry2, the map may reallocate the underlying memory, invalidating the reference held by entry1, violating the other rule of references.

Rust has prevented you from introducing two possible types of memory unsafety into your program; just like it was designed to do.

Shepmaster
  • 388,571
  • 95
  • 1,107
  • 1,366
4

First things first: your example can be simplified with the .or_insert_with() method that takes a closure which returns the value to insert at that key.

It is not possible with the entry pattern to solve your problem, because you mutably borrow the cache first in your entry, and afterwards in your match (or closure). You can try it, if you use a RefCell (which just moves the borrowing from compiletime to runtime) it will throw a panic.

To actually solve your problem, you have to split getting and inserting the value, like this:

fn compute(cache: &mut HashMap<u32, u32>, input: u32) -> u32 {
    if let Some(entry) = cache.get(&input) {
        return *entry;
    }

    let res = if input > 2 {
        // Trivial placeholder for an expensive computation.
        compute(cache, input - 1) + compute(cache, input - 2)
    } else {
        0
    };
    cache.insert(input, res);
    res
}

(if you are on nightly and using ![feature(nll)] you can omit the return and use else on the if let branch to make it a little bit cleaner.

Shepmaster
  • 388,571
  • 95
  • 1,107
  • 1,366
hellow
  • 12,430
  • 7
  • 56
  • 79
  • Thank you! I appreciate the suggestion -- I ended up doing something similar (i.e. exit earlier), as well as split some of my mutable and immutable borrows by refactoring my container that had the hash map. – Kunal Oct 18 '18 at 23:27