1

I was wondering what would be the optimal way to calculate the hash, considering that the values of ptime that are used as the key differ mostly in the hour and date (minutes and seconds are usually 0).

I have done this but I feel that it is quite ugly and slow:

namespace std
{
    /**
     * Specialize std::hash for ptime
     */
    template<>
    class hash<boost::posix_time::ptime>
    {
    public:
        size_t operator()(const boost::posix_time::ptime& t) const
        {
            const auto dt = t.date();
            const auto ho = t.time_of_day().hours();
            return hash<int>()(dt.day_number()) ^ hash<int>()(ho);
        }
    };
}
Marc Mutz - mmutz
  • 24,485
  • 12
  • 80
  • 90
huff
  • 1,954
  • 2
  • 28
  • 49

1 Answers1

4

The keywords you should be looking are the "avalanche effect" and "hash combine".

You probably should not design the hash function by yourself, since this area is thoroughly researched and studied. Just choose the function with a good avalanche effect, for example, MurmurHash.

Since you're already using boost, the boost::hash_combine might be the most appropriate and useful solution for you (also mentioned here):

friend std::size_t hash_value(point const& p)
{
    std::size_t seed = 0;
    boost::hash_combine(seed, p.x);
    boost::hash_combine(seed, p.y);
    return seed;
}

More important, instead of using day_number and hours, you can use something like total_nanoseconds() or even get down to the internal system type and use that value for hashing, avoiding the artifical range decrease when you convert the real timestamp to days/hours.

Community
  • 1
  • 1
Vladimir Sinenko
  • 4,629
  • 1
  • 27
  • 39
  • Isn't total_nanoseconds() from time_duration -not ptime? therefore it would be only possible to get it from time_of_day(), isn't it? – huff Mar 13 '15 at 07:08
  • You're correct, I looked at the wrong paragraph in documentation. But you got the idea: it's better to expand the hash input range to something like milliseconds or ticks, so the output is better distributed. – Vladimir Sinenko Mar 13 '15 at 08:35