0

I have this very simple POD struct with 3 shorts in it, and I'm planning to use it hundreds of millions of times in a unordered_set.

Here's is what I'm doing so far:

struct ps3 {
    short v1;
    short v2;
    short v3;

    bool operator==(const ps3& other) const {
        return v1 == other.v1
            && v2 == other.v2
            && v3 == other.v3;
    }
}

// Hash function:
size_t operator()(const ps3& p) const {
    return (static_cast<size_t>(*reinterpret_cast<const int*>(&p)) << 16) | p.v3;
}

The hash function simply returns the value of ps3, as it can fit inside 8 bytes.

(I saw here that for the basic types the standard hash function just returns itself, so if it was a int with a value of 30, it would return a hash of 30)

I return the value of ps3 by getting the first four bytes, shifting by 16, and then ORing the last short.

I was wondering if this method is good, and if there is anything I can do to improve performance (as it is used hundreds of millions of times, potentially billions)

Benchmarks

I did some benchmarking, and heres what I found out:

  • Current method: 1076ms
  • memcpy: 1500ms
  • Combining each short using hash<short>() as @rturrado suggested: 876ms
  • `*reinterpret_cast(&p)) << 16` is undefined behavior. I would suggest using `memcpy` to copy the struct into a `unsigned long long` and then hash/return that integer. – NathanOliver Jul 15 '21 at 17:26
  • I'm pretty sure this usage of `reinterpret_cast` violates strict aliasing rules, meaning this code has Undefined Behavior. The proper way to do this operation would be to use [`std::memcpy`](https://en.cppreference.com/w/cpp/string/byte/memcpy). – 0x5453 Jul 15 '21 at 17:26
  • 1
    In the book A Tour of C++, they recommend combining existing hash functions using exclusive-or for your struct members. In your case, it would be something like `hash()(v1) ^ hash()(v2) ^ hash()(v3)`. – rturrado Jul 15 '21 at 17:28
  • But wouldn't calling `memcpy` for just 6 bytes of data potentially billions of times be bad on performance? – NewUwpPerson Jul 15 '21 at 17:29
  • 1
    Most likely it wont be. All the of the data should be in cache/registers so it should be pretty quick. – NathanOliver Jul 15 '21 at 17:30
  • 1
    If you're worried about performance, don't use a `std::unorderd_set`. It's basically a `std::vector` and `std::list` is one of the worst containers out there performance wise. – NathanOliver Jul 15 '21 at 17:31
  • @NewUwpPerson Modern compilers know that `memcpy` is the standard way to alias types, and they optimize for this case to avoid a true copy. As a result it is quite possible that `memcpy` will be *faster* than `reinterpret_cast`. – 0x5453 Jul 15 '21 at 17:33
  • Sometimes you just write the code the simplest, most straightforward way you can think of, let the compiler optimize, and see if you need to make improvements by profiling. Compiler's pretty smart. – user4581301 Jul 15 '21 at 17:33
  • Thanks! I'll try memcpy – NewUwpPerson Jul 15 '21 at 17:34

1 Answers1

0

Following the comments from A Tour of C++, a possible implementation of your own hash would be:

struct ps3 { short v1; short v2; short v3; };

struct ps3_hash {
    size_t operator()(const ps3& other) const {
        return hash<short>()(other.v1) ^ hash<short>()(other.v2) ^ hash<short>()(other.v3);
    }
};

unordered_set<ps3, ps3_hash> my_set;  // set of ps3s using ps3_hash for lookup
rturrado
  • 7,699
  • 6
  • 42
  • 62