How does google's sparse hash table handle collisions? i.e. when 2 elements map to the same bucket, how does it decide where to put the new (colliding) element? I'm reading What is the main implementation idea behind sparse hash table? but that answer doesn't cover the collision idea.
Asked
Active
Viewed 560 times
1 Answers
3
Your question is answered in the documentation here, specifically:
2c) If
t.sparsetable[i % 32]
is assigned, but to a value other than foo, look att.sparsetable[(i+1) % 32]
. If that also fails, tryt.sparsetable[(i+3) % 32]
, thent.sparsetable[(i+6) % 32]
. In general, keep trying the next triangular number.
You can read about triangular numbers here.

Tony Delroy
- 102,968
- 15
- 177
- 252
-
I am quite surprised that sparse hash set and dense hash map both will double the size of the sparse table by reallocating (when the table is "too full"), which will involve copying all the elements again, which make things slow. But dense hash set is pretty fast by comparing to other implementations. Thanks for sharing. – HCSF Jul 20 '19 at 02:57
-
@HCSF: it's quite common for containers - most C++ `std::vector` implementations will do that same. Summarily, it means you have average ~2 insertion/emplace/move/copy performed per element. Details: if you're inserting *N* elements, and end up with a capacity C that's a power of 2 >= N, then elements at [C/2..N-1) would have been inserted/emplaced and not copied/moved, elements as [C/4..C/2) were only copied/moved once, elements at [C/8..C/4) were copied/moved twice, etc.: general shape of it is N/2 + 2*N/4 + 3*N/8 + 4*N/16 + .... Balances memory waste vs cache locality vs rehashing cost. – Tony Delroy Feb 23 '20 at 13:35