
I am looking for a hash table implementation that I can use in CUDA code. Are there any good ones out there? Something like the Python dictionary. I will use strings as my keys.

Steve Blackwell
Programmer
  • I tested an implementation of the md5 hash algorithm on the GPU the other day. You could use it to compute hashes for your data and then store them in a map. – karlphillip Sep 19 '11 at 16:42
  • @karlphilip: is there a map implementation on the GPU? – Programmer Sep 19 '11 at 17:37
  • What? You want to store data on the GPU? No, no... use it only for processing! The map itself (the one that connects the hashes with the original data) should be stored in RAM. So if you are programming in C++ you can use something like `std::map`, where `std::string` is the hash computed by the GPU and `pointerToTheOriginalData` is... exactly that. – karlphillip Sep 19 '11 at 17:56
  • 2
    @karlphillip Sometimes the processing means to use the hash map, e.g. currently I am working on LZ77 on GPU. There are more operations that can be done on GPU than matrix multiplication and raycasting. – Radim Vansa Oct 02 '11 at 19:52

2 Answers


Alcantara et al. have demonstrated a data-parallel algorithm for building hash tables on the GPU. I believe the implementation was made available as part of CUDPP.

That said, you may want to reconsider your original choice of a hash table. Sorting your data by key and then performing lots of queries en masse should yield much better performance in a massively parallel setting. What problem are you trying to solve?

Jared Hoberock
  • What does "performing lots of queries en masse" mean? I am trying to store an inverted index in the hash table, where the keys will be strings and the values will be lists of ints. Given a query term, I will find it in the hash table and retrieve the list. – Programmer Sep 20 '11 at 04:02
  • Do you have a small code snippet where you have used the CUDPP hash table? It would be very helpful. – Programmer Sep 20 '11 at 04:06
  • Moreover, why don't you suggest CUDPP? – Programmer Sep 21 '11 at 05:04
  • Is there something similar for OpenCL? – ethanjyx Dec 01 '14 at 02:04
  • 1
    What did you mean by "Sorting your data by key and then performing lots of queries en masse should yield much better performance in a massively parallel setting"? For example, I need access to a few hundred keys by many threads of a wide range. Sure, I could sort them, but that doesn't give me O(1) access, this data is not updated often. Even if I had a table of 1k entries and only a quarter were actually filled, such a table would be more performant than trying to binary search through a list a fraction of the size (8 memory accesses vs 1*c). – Krupip Mar 10 '20 at 19:56
  • BTW, `warpcore` is a framework for creating high-throughput, purpose-built hashing data structures on CUDA accelerators. You can find it here: https://github.com/sleeepyjack/warpcore – Mojtaba Valizadeh Mar 19 '22 at 01:14

When I wrote an OpenCL kernel to create a simple hash table for strings, I used the hash algorithm from Java's String.hashCode(), and then took that modulo the number of rows in the table to get a row index.

Hashing function

// Polynomial hash equivalent to Java's String.hashCode():
// hash = s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
uint getWordHash(__global char* str, uint len) {
  uint hash = 0, multiplier = 1;
  for(int i = len - 1; i >= 0; i--) {
    hash += str[i] * multiplier;
    multiplier = (multiplier << 5) - multiplier; // multiplier *= 31
  }
  return hash;
}

Indexing

uint hash = getWordHash(word, len);
uint row = hash % nRows;

I handled collisions manually of course, and this approach worked well when I knew the number of strings ahead of time.

Steve Blackwell