2

Let's say I'm trying to implement a StringPrefixCounter as a HashMap in Rust. The structure needs to have two operations:

  • put a string
  • count the number of strings with a given &str prefix.

My requirements are:

  • Use a HashMap rather than something totally different like a trie.
  • put strings are moved into the StringPrefixCounter.
  • put strings are not physically cloned in memory.
  • No unsafe code. As idiomatic as possible.

After spending half a day researching a lot of stuff, I finally came up with a solution that seems to work. Full code is below.

However, I'm still not happy because the special Prefix type has a ton of boilerplate that doesn't really do much beyond what &str already does, and wouldn't be necessary at all in other languages. Perhaps there is another way to do this using &str as the map key, referencing strings in separate storage? If so, my requirement is that StringPrefixCounter also owns and encapsulates this storage somehow. Static lifetime requirement is not acceptable.

Solution using Prefix type is below:

use std::borrow::Borrow;
use std::collections::HashMap;
use std::ops::Deref;
use std::rc::Rc;

fn main() {
    let mut counter = StringPrefixCounter::with_capacity(20);
    counter.put(String::from("abc"));
    counter.put(String::from("ab"));
    assert_eq!(counter.count("ab"), 2);
    assert_eq!(counter.count("abc"), 1);
    assert_eq!(counter.count("b"), 0);
}

struct Prefix {
    data : Rc<String>,
    len: usize,
}

impl Deref for Prefix {
    type Target = str;
    fn deref(&self) -> &Self::Target {
        &self.data[..self.len]
    }
}

impl Borrow<str> for Prefix {
    fn borrow(&self) -> &str {
       self.deref()
    }
}

impl PartialEq for Prefix {
    fn eq(&self, other: &Self) -> bool {
        self.deref() == other.deref()
    }
}

impl Eq for Prefix {}

use std::hash::{Hash, Hasher};

impl Hash for Prefix {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.deref().hash(state);
    }
}

struct StringPrefixCounter {
    map : HashMap<Prefix, u32>,
}

impl StringPrefixCounter {
    fn with_capacity(cap : usize) -> StringPrefixCounter {
        StringPrefixCounter{ map: HashMap::with_capacity(cap)}
    }

    fn put(&mut self, s : String) {
        let n = s.len();
        let s = Rc::new(s);
        let mut i = 1;
        while i <= n {
            let pre = Prefix{data: Rc::clone(&s), len: i};
            *self.map.entry(pre).or_insert(0) += 1;
            i += 1;
        }
    }

    fn count(&self, s : &str) -> u32 {
        *self.map.get(s).unwrap_or(&0)
    }
}
  • 1
    I don't think your suggested alternative is possible given the requirement that `StringPrefixCounter` owns the storage; see [Why can't I store a value and a reference to that value in the same struct?](https://stackoverflow.com/questions/32300132/why-cant-i-store-a-value-and-a-reference-to-that-value-in-the-same-struct) – Herohtar Aug 23 '20 at 06:35
  • It's not true that `Prefix` "wouldn't be necessary at all in other languages". Specifically, most other languages will either copy the data, or have a built-in string type that amounts to the same thing as `Prefix`. C++ is an exception where you have to write `Prefix` yourself, just as in Rust. (If by "wouldn't be necessary" you mean "I wouldn't have to write it myself" then you are right.) – trent Aug 23 '20 at 13:43
  • 1
    I think the requirement for the key type to be `&str` makes this a duplicate of [Why can't I store a value and a reference to that value in the same struct?](https://stackoverflow.com/questions/32300132/why-cant-i-store-a-value-and-a-reference-to-that-value-in-the-same-struct) If you can relax that, `Prefix` is basically a smaller version of [`OwningRef, str>`](https://kimundi.github.io/owning-ref-rs/owning_ref/struct.OwningRef.html) with an extra layer of indirection; using `owning_ref` might be a better option. – trent Aug 23 '20 at 14:29
  • Thanks! I tried out `OwningRef, str>` as you suggested and it works like a charm! I read the Rust book but OwningRef wasn't covered. – Alex Eustis Aug 23 '20 at 19:05

0 Answers0