1

Default function is from std::hash. I wonder if there are better hash functions for saving computational time? for integer keys as well as string keys.

I tried City Hash from Google for both integer and string keys, but its performance is a little worse than std::hash.

Dev2017
  • 857
  • 9
  • 31
SuperBald
  • 109
  • 1
  • 12
  • 2
    Generally speaking, you can write a faster hash function if you know something specific about the data that you're hashing. As a silly example, if you're dealing with only two integer values, 17 and 535, you can hash them to 0 and 1 trivially, and that will be faster than any hash function that deals with the full range of integer values. So what is special about the values that you're hashing? – Pete Becker Feb 20 '17 at 13:01
  • 1
    closing a question if you problem is solved is always a good idea :) – Dev2017 Mar 12 '17 at 10:12

2 Answers2

7

std::hash functions are already good in performance. I think you should try open source hash functions.

Check this out https://github.com/Cyan4973/xxHash. I quote from its description: "xxHash is an Extremely fast Hash algorithm, running at RAM speed limits. It successfully completes the SMHasher test suite which evaluates collision, dispersion and randomness qualities of hash functions. Code is highly portable, and hashes are identical on all platforms (little / big endian)."

Also this thread from another question on this site: Fast Cross-Platform C/C++ Hashing Library. FNV, Jenkins and MurmurHash are known to be fast.

Community
  • 1
  • 1
Dev2017
  • 857
  • 9
  • 31
2

You need to explain 'better' in what sense? The fastest hash function would be simply use the value, but that is useless. A more specific answer would depend on your memory constraints and what probabilities of collision are you willing to accept.

Also note that the inbuilt hash functions are built differently for different types, and as a result, I expect the hash functions for int and string to already by optimised in the general sense for time complexity and collision probability.

therainmaker
  • 4,253
  • 1
  • 22
  • 41
  • My goal is to reduce the overall CPU. So, I guess (1) hash calculation itself is fast; (2) collision is low. Not sure if I am on the right track. – SuperBald Feb 20 '17 at 05:25
  • @superbald: the question is why do you think you can do better? The standard library functions were written by extremely clever programmers intent on producing the best library functions possible. If you can do better, it is probably because you know something about your data which can be exploited ro improve the performance for that particular dataset. But you haven't provided any clue about what might make your keys easier to hash. The standard library implememtation must perform well on a wide range of data, and if nothing makes yours special, it will perform well with yours, too. – rici Feb 20 '17 at 05:32
  • Being in STL does not mean it is the best. e.g., Google open sourced hash map and tree map that are often better than std. I am not sure about hash function. So, asked this question. – SuperBald Feb 20 '17 at 06:11