17

I'm using a complex key for HashMap such that the key comprises two parts and one part is a String, and I can't figure out how to do lookups via the HashMap::get method without allocating a new String for each lookup.

Here's some code:

#[derive(Debug, Eq, Hash, PartialEq)]
struct Complex {
    n: i32,
    s: String,
}

impl Complex {
    fn new<S: Into<String>>(n: i32, s: S) -> Self {
        Complex { n: n, s: s.into() }
    }
}

fn main() {
    let mut m = std::collections::HashMap::<Complex, i32>::new();
    m.insert(Complex::new(42, "foo"), 123);

    // OK, but allocates temporary String
    assert_eq!(123, *m.get(&Complex::new(42, "foo")).unwrap());
}

The problem is with the final assertion. It passes, but it requires a temporary heap allocation because I cannot construct a Complex without constructing a String.

To eliminate temporary allocations like this, Rust provides the Borrow trait, which the HashMap::get method makes use of. I understand how to make Borrow work for simple keys. For example, the Rust Standard Library's PathBuf implements Borrow<Path> by making use of std::mem::transmute under the hood, but I can't figure out how to make it work for my Complex type:

#[derive(Debug)]
struct Borrowable {
    // ??? -- What goes here? Perhaps something like:
    n: i32,
    s1: &str, // ??? -- But what would the lifetime be? Or maybe:
    s2: str,  // ??? -- But how would I extend this to a complex type
              //        containing two or more strings?
}

impl Borrowable {
    fn new(n: i32, s: &str) -> &Self {
         // ??? -- What goes here? It must not allocate.
        unimplemented!();
    }
}

impl std::borrow::Borrow<Borrowable> for Complex {
    fn borrow(&self) -> &Borrowable {
        // ??? -- What goes here? How can I transmute a Complex into a
        //        &Borrowable?
        unimplemented!();
    }
}

This seems like a common use case, and I suspect I'm missing something important about Borrow, but I'm at a total loss.

Shepmaster
  • 388,571
  • 95
  • 1,107
  • 1,366
Craig M. Brandenburg
  • 3,354
  • 5
  • 25
  • 37

3 Answers3

11

It sounds like you want this.

Cow will accept a &str or String.

use std::borrow::Cow;

#[derive(Debug, Eq, Hash, PartialEq)]
struct Complex<'a> {
    n: i32,
    s: Cow<'a, str>,
}

impl<'a> Complex<'a> {
    fn new<S: Into<Cow<'a, str>>>(n: i32, s: S) -> Self {
        Complex { n: n, s: s.into() }
    }
}

fn main() {
    let mut m = std::collections::HashMap::<Complex<'_>, i32>::new();
    m.insert(Complex::new(42, "foo"), 123);

    assert_eq!(123, *m.get(&Complex::new(42, "foo")).unwrap());
}

A comment about lifetime parameters:

If you don't like the lifetime parameter and you only need to work with &'static str or String then you can use Cow<'static, str> and remove the other lifetime parameters from the impl block and struct definition.

Shepmaster
  • 388,571
  • 95
  • 1,107
  • 1,366
A.B.
  • 15,364
  • 3
  • 61
  • 64
  • I'm gonna need a day or two to digest this. It's simple, yet it blows my mind. My main concern is that in my particular case `Complex` is exposed in my crate's API, so I need to make sure I can burden the type with a lifetime parameter without muddying my interface too much. My initial reaction is that `Cow`'s copy-on-write functionality is too heavy because I'm never doing any copying, but in a sense I almost am, because sometimes I'm using instances and sometimes I'm using references. After I finish digesting your answer, I'll let you know how it works. – Craig M. Brandenburg Apr 08 '16 at 00:02
  • You might be able to get rid of the lifetime parameters. See my edit. – A.B. Apr 08 '16 at 06:42
  • I cannot eliminate the lifetime parameter because I sometimes borrow from a non-static `str`. Nonetheless, I really like your idea because it would simplify code elsewhere because the new `Complex` covers both the owning and borrowing cases—no need for a unique type for each case. One thing that intrigues me is that the Rust Standard Library didn't go the route of using `Cow` in instead of having two types to cover the owning and borrowing cases for `Path`/`PathBuf` et al. The `Cow` idea seems more generalized because it works for complex types, too. Was this done for run-time efficiency? – Craig M. Brandenburg Apr 08 '16 at 19:29
7

You can follow the ideas described in How to implement HashMap with two keys?. Here's the "borrowed trait object" answer applied to your case:

Create a trait that we can use as a common Borrow target:

trait Key {
    fn to_key(&self) -> (i32, &str);
}

Implement the HashMap-required traits for the trait object:

use std::hash::{Hash, Hasher};

impl Hash for dyn Key + '_ {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.to_key().hash(state)
    }
}

impl PartialEq for dyn Key + '_ {
    fn eq(&self, other: &Self) -> bool {
        self.to_key() == other.to_key()
    }
}

impl Eq for dyn Key + '_ {}

Implement the trait for our primary type and any secondary lookup types:

impl Key for Complex {
    fn to_key(&self) -> (i32, &str) {
        (self.n, &self.s)
    }
}

impl<'a> Key for (i32, &'a str) {
    fn to_key(&self) -> (i32, &str) {
        (self.0, self.1)
    }
}

Implement Borrow for all the lookup types as returning our trait object:

impl<'a> Borrow<dyn Key + 'a> for Complex {
    fn borrow(&self) -> &(dyn Key + 'a) {
        self
    }
}

impl<'a> Borrow<dyn Key + 'a> for (i32, &'a str) {
    fn borrow(&self) -> &(dyn Key + 'a) {
        self
    }
}

Convert to the trait object at query time:

assert_eq!(Some(&123), m.get((42, "foo").borrow() as &dyn Key));

The complete code in the playground


One important "gotcha" is that all of your primary key and your secondary keys must hash in the same manner. This means that the same values need to go into the hash computation in the same order and amount.

You may wish to define Hash by hand to ensure that your primary and secondary keys hash the same!

Here's another example, this time with an enum:

#[derive(Debug, PartialEq, Eq)]
enum ConfigKey {
    Text(String),
    Binary(Vec<u8>),
}

We create a parallel enum that is composed of only references, so it's lightweight to create. It's important that we define the same variants and in the same order as the primary enum so they will hash the same. We rely on the fact that String and &str hash using the same algorithm, as do Vec<T> and &[T]:

impl ConfigKey {
    fn as_ref(&self) -> ConfigKeyRef<'_> {
        match self {
            ConfigKey::Text(t) => ConfigKeyRef::Text(t),
            ConfigKey::Binary(b) => ConfigKeyRef::Binary(b),
        }
    }
}

#[derive(Hash, PartialEq, Eq)]
enum ConfigKeyRef<'a> {
    Text(&'a str),
    Binary(&'a [u8]),
}

We use this new enum as our common underlying key type:

trait Key {
    fn to_key(&self) -> ConfigKeyRef<'_>;
}

And implement our trait for our primary and secondary keys:

impl Key for ConfigKey {
    fn to_key(&self) -> ConfigKeyRef<'_> {
        self.as_ref()
    }
}

impl<'a> Key for &'a str {
    fn to_key(&self) -> ConfigKeyRef<'_> {
        ConfigKeyRef::Text(self)
    }
}

The complete code in the playground

Shepmaster
  • 388,571
  • 95
  • 1,107
  • 1,366
2

You can use the hashbrown crate directly (it is the underlying implementation of std's HashMap). Instead of Borrow, it provides a more flexible Equivalent trait, that you can use in your example:

#[derive(Debug, Hash)]
struct BorrowedComplex<'a> {
    n: i32,
    s: &'a str,
}

impl<'a> BorrowedComplex<'a> {
    fn new(n: i32, s: &'a str) -> Self {
        Self { n, s }
    }
}

impl hashbrown::Equivalent<Complex> for BorrowedComplex<'_> {
    fn equivalent(&self, key: &Complex) -> bool {
        self.n == key.n && self.s == key.s
    }
}

fn main() {
    let mut m = hashbrown::HashMap::<Complex, i32>::new();
    m.insert(Complex::new(42, "foo"), 123);

    assert_eq!(123, *m.get(&BorrowedComplex::new(42, "foo")).unwrap());
}
Chayim Friedman
  • 47,971
  • 5
  • 48
  • 77