48

HashMap implements the get and insert methods which take a single immutable borrow and a single move of a value respectively.

I want a trait which is just like this but which takes two keys instead of one. It uses the map inside, but it's just a detail of implementation.

pub struct Table<A: Eq + Hash, B: Eq + Hash> {
    map: HashMap<(A, B), f64>,
}

impl<A: Eq + Hash, B: Eq + Hash> Memory<A, B> for Table<A, B> {
    fn get(&self, a: &A, b: &B) -> f64 {
        let key: &(A, B) = ??;
        *self.map.get(key).unwrap()
    }

    fn set(&mut self, a: A, b: B, v: f64) {
        self.map.insert((a, b), v);
    }
}
Mohammed Zayan
  • 859
  • 11
  • 20
lsunsi
  • 505
  • 4
  • 7
  • 1
    And what problem are you facing? – Peter Hall Aug 20 '17 at 21:01
  • 1
    I can't wrap my head around how do I take the two borrows and make a borrowed tuple. Or how to turn &A and &B into &(A, B), which is what I need to `.get` the hashmap – lsunsi Aug 20 '17 at 21:05
  • 1
    It will be easier for people to help you if you show what you tried and what errors or problems you encountered. See https://stackoverflow.com/help/mcve – Peter Hall Aug 21 '17 at 09:36
  • I don't think you can because the `(A,B)` tuple requires its elements to be next to each other in memory, whereas your `&A` and `&B` may be stored anywhere. You will need to use `Copy`able or `Clone`able keys so you can duplicate them as needed, or remove the references and move them inside your `get` function. – Jmb Aug 21 '17 at 09:50
  • The borrow checker is lurking behind what you are trying to do: While you have borrows on `A` and `B` individually, neither the caller nor the hashmap is not going to keep ownership of the tuple `(&A, &B)`. If you construct the tuple yourself and use a reference to that, nobody is holding ownership while the tuple is referenced by the map. Long story short: You can't. Construct map for `&(&A, &B)` directly and have the caller construct the tuples, instead of trying to use two individual keys. – user2722968 Aug 21 '17 at 10:08

3 Answers3

70

This is certainly possible. The signature of get is

fn get<Q: ?Sized>(&self, k: &Q) -> Option<&V> 
where
    K: Borrow<Q>,
    Q: Hash + Eq, 

The problem here is to implement a &Q type such that

  1. (A, B): Borrow<Q>
  2. Q implements Hash + Eq

To satisfy condition (1), we need to think of how to write

fn borrow(self: &(A, B)) -> &Q

The trick is that &Q does not need to be a simple pointer, it can be a trait object! The idea is to create a trait Q which will have two implementations:

impl Q for (A, B)
impl Q for (&A, &B)

The Borrow implementation will simply return self and we can construct a &dyn Q trait object from the two elements separately.


The full implementation is like this:

use std::borrow::Borrow;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// See explanation (1).
trait KeyPair<A, B> {
    /// Obtains the first element of the pair.
    fn a(&self) -> &A;
    /// Obtains the second element of the pair.
    fn b(&self) -> &B;
}

// See explanation (2).
impl<'a, A, B> Borrow<dyn KeyPair<A, B> + 'a> for (A, B)
where
    A: Eq + Hash + 'a,
    B: Eq + Hash + 'a,
{
    fn borrow(&self) -> &(dyn KeyPair<A, B> + 'a) {
        self
    }
}

// See explanation (3).
impl<A: Hash, B: Hash> Hash for dyn KeyPair<A, B> + '_ {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.a().hash(state);
        self.b().hash(state);
    }
}

impl<A: Eq, B: Eq> PartialEq for dyn KeyPair<A, B> + '_ {
    fn eq(&self, other: &Self) -> bool {
        self.a() == other.a() && self.b() == other.b()
    }
}

impl<A: Eq, B: Eq> Eq for dyn KeyPair<A, B> + '_ {}

// OP's Table struct
pub struct Table<A: Eq + Hash, B: Eq + Hash> {
    map: HashMap<(A, B), f64>,
}

impl<A: Eq + Hash, B: Eq + Hash> Table<A, B> {
    fn new() -> Self {
        Table {
            map: HashMap::new(),
        }
    }

    fn get(&self, a: &A, b: &B) -> f64 {
        *self.map.get(&(a, b) as &dyn KeyPair<A, B>).unwrap()
    }

    fn set(&mut self, a: A, b: B, v: f64) {
        self.map.insert((a, b), v);
    }
}

// Boring stuff below.

impl<A, B> KeyPair<A, B> for (A, B) {
    fn a(&self) -> &A {
        &self.0
    }
    fn b(&self) -> &B {
        &self.1
    }
}
impl<A, B> KeyPair<A, B> for (&A, &B) {
    fn a(&self) -> &A {
        self.0
    }
    fn b(&self) -> &B {
        self.1
    }
}

//----------------------------------------------------------------

#[derive(Eq, PartialEq, Hash)]
struct A(&'static str);

#[derive(Eq, PartialEq, Hash)]
struct B(&'static str);

fn main() {
    let mut table = Table::new();
    table.set(A("abc"), B("def"), 4.0);
    table.set(A("123"), B("456"), 45.0);
    println!("{:?} == 45.0?", table.get(&A("123"), &B("456")));
    println!("{:?} == 4.0?", table.get(&A("abc"), &B("def")));
    // Should panic below.
    println!("{:?} == NaN?", table.get(&A("123"), &B("def")));
}

Explanation:

  1. The KeyPair trait takes the role of Q we mentioned above. We'd need to impl Eq + Hash for dyn KeyPair, but Eq and Hash are both not object safe. We add the a() and b() methods to help implementing them manually.

  2. Now we implement the Borrow trait from (A, B) to dyn KeyPair + 'a. Note the 'a — this is a subtle bit that is needed to make Table::get actually work. The arbitrary 'a allows us to say that an (A, B) can be borrowed to the trait object for any lifetime. If we don't specify the 'a, the unsized trait object will default to 'static, meaning the Borrow trait can only be applied when the implementation like (&A, &B) outlives 'static, which is certainly not the case.

  3. Finally, we implement Eq and Hash. Same reason as point 2, we implement for dyn KeyPair + '_ instead of dyn KeyPair (which means dyn KeyPair + 'static in this context). The '_ here is a syntax sugar meaning arbitrary lifetime.


Using trait objects will incur indirection cost when computing the hash and checking equality in get(). The cost can be eliminated if the optimizer is able to devirtualize that, but whether LLVM will do it is unknown.

An alternative is to store the map as HashMap<(Cow<A>, Cow<B>), f64>. Using this requires less "clever code", but there is now a memory cost to store the owned/borrowed flag as well as runtime cost in both get() and set().

Unless you fork the standard HashMap and add a method to look up an entry via Hash + Eq alone, there is no guaranteed-zero-cost solution.

SOFe
  • 7,867
  • 4
  • 33
  • 61
kennytm
  • 510,854
  • 105
  • 1,084
  • 1,005
  • 7
    I'm stunned on how complete and detailed this answer was. After reading it three times (that's on Rust, not you), I think I get it. Yeah, I thought I was missing something but it turns out the problem was kind of tricky. Many many thanks. – lsunsi Aug 21 '17 at 11:54
  • 2
    yeah, mad props. This is a great answer to the question, but also a great example for beginners of how to leverage and understand Rust's trait system – fostandy Jan 31 '20 at 01:19
  • A slightly confusing point is that line 14 (`impl Borrow for (A, B)`) actually covers both `(A, B)` and `(&A, &B)`. I was trying to port this solution to a similar problem where I am using `struct Foo([T; 2])` as the key type instead, and I needed to impl Borrow for both `Foo` and `(&T, &T)` separately. This was actually explained in (2), but I missed that at first and took a while to notice. – SOFe Aug 07 '21 at 10:15
  • Another note is that, as of Rust 1.56.0-nightly (2021-07-31), if you miss the `+ '_` behind the `Hash` or `Eq` impl, you get the error `explicit lifetime required in the type of b` on line 53, annotating `self.map.get` with `lifetime 'static required` without pointing to the correct place. This is quite confusing since we didn't explicitly mention `'static` anywhere in the code, but is implied by the `dyn`. – SOFe Aug 07 '21 at 10:35
3

A Memory trait that take two keys, set by value and get by reference:

trait Memory<A: Eq + Hash, B: Eq + Hash> {

    fn get(&self, a: &A, b: &B) -> Option<&f64>;

    fn set(&mut self, a: A, b: B, v: f64);
}

You can impl such trait using a Map of Maps:

pub struct Table<A: Eq + Hash, B: Eq + Hash> {
    table: HashMap<A, HashMap<B, f64>>,
}   

impl<A: Eq + Hash, B: Eq + Hash> Memory<A, B> for Table<A, B> {

    fn get(&self, a: &A, b: &B) -> Option<&f64> {
        self.table.get(a)?.get(b)
    }

    fn set(&mut self, a: A, b: B, v: f64) {
        let inner = self.table.entry(a).or_insert(HashMap::new());
        inner.insert(b, v);
    }
}

Please note that if the solution is somewhat elegant the memory footprint of an HashMap of HashMaps have to be considered when thousands of HashMap instances has to be managed.

Full example

attdona
  • 17,196
  • 7
  • 49
  • 60
0

I've found a nicer-ish answer to this question. We can create a wrapper hash map which borrows a tuple of the keys and returns them when it doesn't need them.

use std::cmp::Eq;
use std::hash::Hash;
use std::collections::HashMap;

struct MultiKeyHashMap<K1, K2, V>
where K1: Eq + Hash,
      K2: Eq + Hash {
    map: HashMap<(K1, K2), V>,
}

impl<K1, K2, V> MultiKeyHashMap<K1, K2, V>
where K1: Eq + Hash,
      K2: Eq + Hash {
    fn insert(&mut self, k1: K1, k2: K2, v: V) -> Option<V> {
        self.map.insert((k1, k2), v)
    }

    fn contains(&self, k1: K1, k2: K2) -> (K1, K2, bool) {
        let k = (k1, k2);
        let contained = self.map.contains_key(&k);
        (k.0, k.1, contained)
    }

    fn remove(&mut self, k1: K1, k2: K2) -> (K1, K2, Option<V>) {
        let k = (k1, k2);
        let removed = self.map.remove(&k);
        (k.0, k.1, removed)
    }

    fn with_capcity(cap: usize) -> Self {
        Self{map: HashMap::with_capacity(cap)}
    }
}

fn main() {
    let mut m = MultiKeyHashMap::<String, String, i32>::with_capcity(10);
    m.insert("hello".to_owned(), "world".to_owned(), 20);
    let (k1, k2, contained) = m.contains("hello".to_owned(), "world".to_owned());
    if !contained {
        println!("failed finding key in map");
    }

    let (k1, k2, val) = m.remove(k1, k2);
    match val {
        Some(i) => {
            if i != 20 {
                println!("failed properly storing/removing value from map");
            }
        },
        None => println!("failed removing value from map"),
    }
}

Syntactically not super elegant but 0 cost.

If you're comfortable with unsafe:

use std::cmp::Eq;
use std::hash::Hash;
use std::collections::HashMap;
use std::mem::forget;

struct MultiKeyHashMap<K1, K2, V>
where K1: Eq + Hash,
      K2: Eq + Hash {
    map: HashMap<(K1, K2), V>,
}

impl<K1, K2, V> MultiKeyHashMap<K1, K2, V>
where K1: Eq + Hash,
      K2: Eq + Hash {
    fn insert(&mut self, k1: K1, k2: K2, v: V) -> Option<V> {
        self.map.insert((k1, k2), v)
    }

    fn contains(&self, k1: &K1, k2: &K2) -> bool {
        unsafe {
            let k1_ptr = k1 as *const K1;
            let k2_ptr = k2 as *const K2;
            let key = (k1_ptr.read(), k2_ptr.read());
            let contained = self.map.contains_key(&key);
            forget(key);
            contained
        }
    }

    fn remove(&mut self, k1: &K1, k2: &K2) -> Option<V> {
        unsafe {
            let k1_ptr = k1 as *const K1;
            let k2_ptr = k2 as *const K2;
            let key = (k1_ptr.read(), k2_ptr.read());
            let removed = self.map.remove(&key);
            forget(key);
            removed
        }
    }

    fn with_capcity(cap: usize) -> Self {
        Self{map: HashMap::with_capacity(cap)}
    }
}

fn main() {
    let mut m = MultiKeyHashMap::<String, String, i32>::with_capcity(10);
    m.insert("hello".to_owned(), "world".to_owned(), 20);
    let (k1, k2) = ("hello".to_owned(), "world".to_owned());
    if !m.contains(&k1, &k2) {
        println!("failed finding key in map");
    }

    let val = m.remove(&k1, &k2);
    match val {
        Some(i) => {
            if i != 20 {
                println!("failed properly storing/removing value from map");
            }
        },
        None => println!("failed removing value from map"),
    }
}