1

I have a struct that contains a unique id and uses that id for its hash:

use std::borrow::Borrow;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

type Id = u32;

#[derive(Debug, Eq)]
struct Foo {
    id: Id,
    other_data: u32,
}

impl PartialEq for Foo {
    fn eq(&self, other: &Foo) -> bool {
        self.id == other.id
    }
}

impl Hash for Foo {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.id.hash(state);
    }
}

impl Borrow<Id> for Foo {
    fn borrow(&self) -> &Id {
        &self.id
    }
}

I understand that I cannot modify the value of Foo::id once I've put it into the HashSet because that would change the hash. However, I would like to modify Foo::other_data. I know I could remove it from the HashSet, modify it, and insert it again, but a method like get_mut() would be so much cleaner. Is there a way to accomplish something like this:

fn main() {
    let mut baz = HashSet::new();
    baz.insert(Foo {
        id: 1,
        other_data: 2,
    });

    if let Some(x) = baz.get_mut(&1) {
        *x = 3;
    }
}

Is this an anti-pattern; should I be using a HashMap instead?

Related to this question.

Shepmaster
  • 388,571
  • 95
  • 1,107
  • 1,366
HiDefender
  • 2,088
  • 2
  • 14
  • 31
  • `Or is this an anti-pattern and I should be using HashMap instead?` I think you hit the nail on the head right there. You want a data structure with an immutable hash part and a mutable non-hash part, which is exactly what `HashMap` gives you. – Silvio Mayolo Apr 01 '19 at 23:48
  • @SilvioMayolo My structs contain their unique ids. With a HashMap how do you ensure that the key corresponds to id value the struct contains? – HiDefender Apr 02 '19 at 00:05
  • 1
    I would recommend encapsulating it in a hashset-like interface. You can use a hashmap internally and encapsulate the guarantees about the ID. – Silvio Mayolo Apr 02 '19 at 00:33

2 Answers2

10

This is not possible with your current data structure.

HashSet deliberately does not provide methods to mutate values. As you have alluded to, mutating a value in a HashSet (or the key in a HashMap) will invalidate the hash in the majority of cases. The API encourages proper usage and even makes mention of this:

It is a logic error for an item to be modified in such a way that the item's hash, as determined by the Hash trait, or its equality, as determined by the Eq trait, changes while it is in the set. This is normally only possible through Cell, RefCell, global state, I/O, or unsafe code.

This alludes to one way that you can solve your problem, by using interior mutability:

use std::cell::Cell;

#[derive(Debug, Eq)]
struct Foo {
    id: Id,
    other_data: Cell<u32>,
}
fn main() {
    let mut baz = HashSet::new();
    baz.insert(Foo {
        id: 1,
        other_data: Cell::new(2),
    });

    if let Some(x) = baz.get(&1) {
        x.other_data.set(3);
    }
}

This is a reasonable thing to do, but I wouldn't be thrilled about doing it. Instead, I would allow my type to be decomposed into a key and a value and store it in a HashMap, as mentioned. Something like


impl Foo {
    // or insert_into_hashmap(self, &mut HashMap<Id, u32>)
    fn into_key_value(self) -> (Id, u32) {
        (self.id, self.other_data)
    }

    // Maybe a
    //
    // fn from_key_value(&'a Id, &'a u32) -> Self
    // or
    // fn from_hashmap(Id, &HashMap<Id, u32>) -> Self
}

// Maybe a
//
// struct FooRef<'a> { (or FooRefMut?) 
//     id: &'a Id,
//     other_data: &'a u32,
// }
//
// With a
// fn from_key_value(&'a Id, &'a u32) -> Self
// or
// fn from_hashmap(Id, &HashMap<Id, u32>) -> Self

fn main() {
    let mut baz = HashMap::new();
    let f = Foo {
        id: 1,
        other_data: 2,
    };
    let (k, v) = f.into_key_value();
    baz.insert(k, v);

    // See also HashMap::get_key_value
    if let Some(v) = baz.get_mut(&1) {
        *v = 3;
    }
}
Jmb
  • 18,893
  • 2
  • 28
  • 55
Shepmaster
  • 388,571
  • 95
  • 1,107
  • 1,366
-2

I believe unsafe code is the best route in this case.

impl Foo {
    fn set_other_data(set: &mut HashSet<Foo>, id: &Id, data: u32) -> bool{
        match set.get(id) {
            Some(x) => {
                let p: *const Foo = x;
                let q: *mut Foo = p as *mut Foo;
                unsafe {
                    (*q).other_data = data;
                }
                return true;
            }
            None => return false,
        }
    }
}

fn main() {
    let mut baz = HashSet::new();
    baz.insert(Foo {
        id: 1,
        other_data: 2,
    });

    Foo::set_other_data(&mut baz, &1, 3);
    assert_eq!(3, baz.get(&1).unwrap().other_data);
}

As Shepmaster quoted:

It is a logic error for an item to be modified in such a way that the item's hash, as determined by the Hash trait, or its equality, as determined by the Eq trait, changes while it is in the set. This is normally only possible through Cell, RefCell, global state, I/O, or unsafe code.

In this case other_data is not used by the Hash or Eq traits. So it can safely be mutated. The largest danger is that at a later point Hash for Foo or Eq for Foo would be modified and include other_data.

There is no danger of data races because the HashSet<Foo> is mutably borrowed.

Other options:

Decomposing: This works when Foo has only 2 elements, but suppose Foo contains many elements. Do you decompose Foo into all of its individual elements (seems messy) or create sub-structs within Foo (code bloat).

Encapsulation: Silvio Mayolo suggested encapsulating Foo in a HashSet like interface that internally uses HashMap. This keeps the API clean and uses only safe code, but seems like more programming then is necessary.

I would appreciate your feedback and if this seems reasonable I can put in a feature request for an unsafe fn get_mut() for HashSet.

HiDefender
  • 2,088
  • 2
  • 14
  • 31
  • I believe this code cause undefined behavior and cannot recommend it. Here's a [small example](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=9ae49d432427a44e6f57cf8151b1ea95) where Miri detects violations. – Shepmaster Apr 03 '19 at 19:50
  • no you are not allowed to do this, you can't mutate something that should be const `set.get(id)` – Stargateur Apr 03 '19 at 23:31
  • @Shepmaster Miri complains that `error[E0080]: constant evaluation error: borrow being accessed (Alias(None)) does not exist on the borrow stack`. However, there is no fear of any other thread mutating the `HashSet` in the code I gave above because the `HashSet` is mutably borrowed: `set: &mut HashSet`. I don't see what the undefined behavior would be. – HiDefender Apr 04 '19 at 19:11
  • @HiDefender threading has nothing to do with undefined behavior. The Rust compiler is allowed to assume that *nothing* mutates the referred-to value of an immutable reference, unless there's an [`UnsafeCell`](https://doc.rust-lang.org/std/cell/struct.UnsafeCell.html): *In general, transmuting an `&T` type into an `&mut T` is considered undefined behavior.* – Shepmaster Apr 04 '19 at 19:14