  • I have a bunch of long immutable strings, which I would like to store in a HashSet.
  • I need a bunch of mappings with these strings as keys.
  • I would like to use references to these strings as keys in these mappings to avoid copying strings.

This is what I eventually arrived at. My only concern is the extra copy I have to make at line 5.

use std::collections::{HashMap, HashSet};

let mut strings: HashSet<String> = HashSet::new();  // 1
let mut map: HashMap<&String, u8> = HashMap::new(); // 2
                                                    // 3
let s = "very long string".to_string();             // 4
strings.insert(s.clone());                          // 5
let s_ref = strings.get(&s).unwrap();               // 6
map.insert(s_ref, 5);                               // 7


To avoid this cloning I found two workarounds:

Is there any sensible way to remove this excessive cloning?

Zozz
  • Keep in mind that the HashMap may reorganize its internals when it is modified, moving the items around and making the `&String` references dangling. – Aug 29 '16 at 12:14
  • Why not just keep the `HashMap<String, u8>`? You could still check a string's membership with the HashMap. – kennytm Aug 29 '16 at 12:15
  • @kennytm, I need several mappings and didn't want to copy the keys for every one. I can have one hashmap with just the strings and then others with references to them. That is the main problem. – Zozz Aug 29 '16 at 12:17
  • What about a `HashMap<String, Content>` where `Content` is a struct of values for your several mappings? – kennytm Aug 29 '16 at 12:36
  • That is one way to solve this, but it has different semantics. For example, I would not be able to borrow some mappings immutably and others mutably, or to separate which data gets accessed in which functions. – Zozz Aug 29 '16 at 12:40
  • What about using `&str` instead of `String`? You could easily have a `HashSet<&str>` and `HashMap<&str, u8>`. – antoyo Aug 29 '16 at 12:54
  • @antoyo: same issue, you have to handle the lifetime of those `str` (especially if created dynamically). – Matthieu M. Aug 29 '16 at 14:32

1 Answer


It seems to me that what you are looking for is string interning. There is a library, string-cache, developed as part of the Servo project, which may be of help.

In any case, the basics are simple:

  • a long-lived pool of String values, which guarantees that the strings never move once stored
  • a look-up system to avoid inserting duplicates in the pool
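
The two ingredients above can be sketched with the standard library alone. Note that this swaps in a different technique from the arena described below: the pool holds reference-counted `Rc<str>` handles, so the maps own cheap handles (a pointer copy and a counter increment) rather than borrowing from the pool. The `Interner` type and its `intern` method are hypothetical names made up for this illustration, and the sketch assumes single-threaded use (`Arc<str>` would be the thread-safe counterpart).

```rust
use std::collections::{HashMap, HashSet};
use std::rc::Rc;

/// Hypothetical interner: a pool of shared, immutable strings.
#[derive(Default)]
struct Interner {
    pool: HashSet<Rc<str>>,
}

impl Interner {
    /// Returns the pooled handle for `s`; the text is copied only the
    /// first time a given string is seen.
    fn intern(&mut self, s: &str) -> Rc<str> {
        // `Rc<str>` implements `Borrow<str>`, so we can look up by `&str`.
        if let Some(existing) = self.pool.get(s) {
            return Rc::clone(existing);
        }
        let handle: Rc<str> = Rc::from(s);
        self.pool.insert(Rc::clone(&handle));
        handle
    }
}

fn main() {
    let mut interner = Interner::default();
    let mut scores: HashMap<Rc<str>, u8> = HashMap::new();
    let mut lengths: HashMap<Rc<str>, usize> = HashMap::new();

    let key = interner.intern("very long string");
    // Cloning an `Rc` bumps a counter; the string data is not copied.
    scores.insert(Rc::clone(&key), 5);
    lengths.insert(Rc::clone(&key), key.len());

    // Interning the same text again yields the same allocation.
    let again = interner.intern("very long string");
    assert!(Rc::ptr_eq(&key, &again));
    assert_eq!(scores.get("very long string"), Some(&5));
}
```

Because the maps hold owned handles, they carry no lifetime parameter and can be borrowed mutably or immutably independently of the pool.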

You can use a typed arena to store your String, and then store &str to those strings without copying them (they will live for as long as the arena lives). Use a HashSet<&str> on top to avoid duplicates, and you're set.
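
A rough, standard-library-only sketch of that layout follows. To stay self-contained it does not use an arena crate. Instead, each new string is leaked with `Box::leak`, which gives the same "never moves, lives as long as the pool" property only under the assumption that the pool lives for the whole program, since the memory is never freed. The `StrPool` type and its `intern` method are hypothetical names for this illustration. With a real arena, the `'static` lifetime would become the arena's lifetime.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical pool; leaked allocations stand in for an arena here.
#[derive(Default)]
struct StrPool {
    seen: HashSet<&'static str>,
}

impl StrPool {
    /// Returns a stable `&str` for `s`, storing the text at most once.
    fn intern(&mut self, s: &str) -> &'static str {
        if let Some(&existing) = self.seen.get(s) {
            return existing;
        }
        // Assumption: leaking is acceptable because the pool is never dropped.
        let stored: &'static str = Box::leak(s.to_owned().into_boxed_str());
        self.seen.insert(stored);
        stored
    }
}

fn main() {
    let mut pool = StrPool::default();
    let mut map: HashMap<&str, u8> = HashMap::new();

    let key = pool.intern("very long string");
    map.insert(key, 5);

    // A second, separately allocated copy resolves to the same stored slice.
    let again = pool.intern(&String::from("very long string"));
    assert!(std::ptr::eq(key, again));
    assert_eq!(map.get("very long string"), Some(&5));
}
```

The stored slices are plain `&str` values, so any number of maps can use them as keys without copying the text.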

Matthieu M.