How does google's sparse hash table handle collisions?

Question

How does google's sparse hash table handle collisions? i.e. when 2 elements map to the same bucket, how does it decide where to put the new (colliding) element? I'm reading What is the main implementation idea behind sparse hash table? but that answer doesn't cover the collision idea.

score 3 · Answer 1 · answered Jul 22 '15 at 11:27

3

Your question is answered in the documentation here, specifically:

2c) If t.sparsetable[i % 32] is assigned, but to a value other than foo, look at t.sparsetable[(i+1) % 32]. If that also fails, try t.sparsetable[(i+3) % 32], then t.sparsetable[(i+6) % 32]. In general, keep trying the next triangular number.

You can read about triangular numbers here.

answered Jul 22 '15 at 11:27

Tony Delroy

102,968
15
177
252

I am quite surprised that sparse hash set and dense hash map both will double the size of the sparse table by reallocating (when the table is "too full"), which will involve copying all the elements again, which make things slow. But dense hash set is pretty fast by comparing to other implementations. Thanks for sharing. – HCSF Jul 20 '19 at 02:57
@HCSF: it's quite common for containers - most C++ `std::vector` implementations will do that same. Summarily, it means you have average ~2 insertion/emplace/move/copy performed per element. Details: if you're inserting *N* elements, and end up with a capacity C that's a power of 2 >= N, then elements at [C/2..N-1) would have been inserted/emplaced and not copied/moved, elements as [C/4..C/2) were only copied/moved once, elements at [C/8..C/4) were copied/moved twice, etc.: general shape of it is N/2 + 2*N/4 + 3*N/8 + 4*N/16 + .... Balances memory waste vs cache locality vs rehashing cost. – Tony Delroy Feb 23 '20 at 13:35

How does google's sparse hash table handle collisions?

1 Answers1