Fastest type of hash map

Question

I need to hash ~15000 unsigned integers that are sampled from the space of 2^12 on the low end, to as much as 2^32 on the high end. I also need to store the index for a reverse lookup. A naive example using C++ STL would be:

std::map<unsigned int, std::set<unsigned int /* unique indices */> > m;

In the dense case, we could think of this as:

std::vector<std::set<unsigned int /* unique indices */> > v;

Now for the problem. Speed is the most important factor here, but I"m still limited in terms of memory. I'll need to store 1000's of these maps in memory and access at a high rate in a low latency application. A query should be on the order of nano-seconds.

I currently store the data using the dense method. However, I'd like to increase the range of keys that need to be hashed to 2^32 which makes the dense method problematic. Remember, I only need to store ~15000 keys in the map.

On the plus side, once the map is built, I will never insert anything into it again. I will only query it afterwards. The insertions still needs to be reasonably fast, but not as critical as the query.

Some code I have experimented with are:

Google SparseHash
Google DenseHash
STL unordered_map
STL map

I don't mind writing my own hash table. I wanted to get some expert advice before tackling it myself.

Are you implying that none of the existing libraries are fast enough? — Oliver Charlesworth, Jan 01 '13 at 17:44
The dense version is fast enough. However, it consumes a lot of memory when the keys are sampled from a space of 2^32 or higher. — paul, Jan 01 '13 at 17:46
I'd like to know if there are any tricks I can use if I know the map will not change after it's built. — paul, Jan 01 '13 at 17:48
1) How much memory can you afford to spend on it ? 2) the nano-scond requirement is to retrieve only one item ? 3) I don't understand the "reversed" lookup, what are the keys, and what is the payload ? — wildplasser, Jan 01 '13 at 18:37
Yes. Only one item. I don't have a number on the memory requirement yet. However, this is for a mobile phone. — paul, Jan 01 '13 at 18:38
Se my answer here : http://stackoverflow.com/a/8104713/905902 is a bit different, but intended to be memory-efficient. Modifying it to be a double-entry hash-table would not be too difficult (but you'd lose the locality of reference feature for one of the keydomains) — wildplasser, Jan 01 '13 at 18:41
If you have thousands of these maps, and each one must hold 15 K numbers, and those numbers are 32-bit numbers before hashing, then each map is going to occupy more than 60 KiB, and a thousand of these maps is going to occupy 60 MiB. And that ignores the reverse lookup and largely overlooks per entry overhead. Are you sure you can afford to do this? Or is this back of the envelope calculation misunderstanding something? — Jonathan Leffler, Jan 01 '13 at 18:42
But does the payload exist of a *set* of integers ? Cant you use a bitmap (or a compressed bitmap or Judy tree) to represent them? More info, please; the description is a bit too vague, IMHO. — wildplasser, Jan 01 '13 at 18:50
@wildplasser Thanks. I'm not familiar with Judy Trees. I'll take a look. This is actually for a computer vision problem. I actually have a larger vector (512 bits) that I need to hash. So I sample a random subset of these bits and create a key. Since the 512 bit vector may contain noise on the query, I sample N bits M times to increase the chances of a k-nearest neighbor ranking. This is similar to Locality Sensitive Hashing. The N bits that are sampled generate a key. It's this key that I need to store in a hash table along with the reverse index. — paul, Jan 01 '13 at 19:11
@wildplasser The reverse lookup is the index of the larger vector (512 bits) that is stored elsewhere. — paul, Jan 01 '13 at 19:16
Instead of explaining your partial (and possibly wrong) solution, you should present the actual problem. It is impossible for us to distinguish between actual requirements and (false) assumptions. BTW: what is a vector? just a 512 bit entity? — wildplasser, Jan 01 '13 at 19:17
@Jonathan I can afford 60 MB, or even 100+ MB. This is probably approaching my limit though. — paul, Jan 01 '13 at 19:19
@wildplasser This problem is part of a much larger computer vision problem. I don't want to get into the details of the computer vision problem. What I have presented is the problem I need a solution to. — paul, Jan 01 '13 at 19:21
@Sancho It is the problem I need a solution to. What I need is a fast way to represent the problem I have stated. If there is another solution that does not fit into the stated problem, then it is a solution to another problem. Not the one I have stated. — paul, Jan 02 '13 at 17:03

kmcguire · Answer 1 · 2013-11-24T19:16:52.643

The average GET operation should be under 1ms ranging from 189ns with 1024 entries (349KB in memory) to 888ns for 27,648 entries (6MB in memory). The maximum latency on an entry with 27k entries is 44,000ns. But, if the average time is what matters for you and not a high latency ever so often then this might be basically what you want. I think it can be optimized further, but not sure of the gains to be made.

typedef unsigned int uintptr;
typedef unsigned int uint32;
typedef unsigned short uint16;
typedef unsigned char uint8;


namespace anything { namespace linklist {
typedef struct _HDR {
    void              *next;
    void              *prev;
} HDR;

void *next(void *ptr) {
    if (ptr == 0) {
        return 0;
    }
    return ((void**)ptr)[0];
}

void add(void **chain, void *toadd) {
    ((void**)toadd)[0] = *chain;
    ((void**)toadd)[1] = 0;         /* set previous */

    /* set previous link if valid pointer */
    if (*chain)
        ((void**)*chain)[1] = toadd;

    *chain = toadd;
}
}}

namespace anything{ namespace hash {
   typedef struct _B {
      MASS_LL_HDR    llhdr;
      uint32         id;
      union {
         struct _B    *chain;
         uintptr      value;
      };
   } B;

   typedef struct _HT {
      B        *buckets;
      uint16   depth;
      uint8    bbl;
   } HT;

   void init(HT *ht, uint8 bbl) {
      ht->buckets = 0;
      ht->bbl = bbl;
   }

   void _free(B **chain, uint16 dcnt, uint16 dcntmax, uint32 *_m) {
      B        *ba, *_ba;

      for (ba = *chain, _ba = 0; ba; ba = _ba) {
         _ba = (B*)mass_ll_next(ba);

         if (dcnt < dcntmax - 1) {
            _free(&ba->chain, dcnt + 1, dcntmax, _m);
            *_m = *_m + 1;
            dfree(ba);
         }
      }

      /* zero the chain out */
      *chain = 0;
   }

   void free(HT *ht) {
      uint32      m;
      uint16      dm;

      dm = (sizeof(uintptr) * 8) / ht->bbl;
      m = 0;

      _free(&ht->buckets, 0, dm, &m);
   }

   int get(HT *ht, uintptr k, uintptr *v) {
      uintptr        a;
      B             *ba, **cur;

      uint16         bi, lcnt;
      uint32         mask;

      lcnt = (sizeof(uintptr) * 8) / ht->bbl;

      cur = &ht->buckets;

      mask = ~(~0 << ht->bbl);

      for (bi = 0; bi < lcnt; ++bi) {

         a = (k >> (bi * ht->bbl)) & mask;

         for (ba = *cur; ba; ba = (B*)mass_ll_next(ba)) {
            if (ba->id == a) {
               break;
            }
         }

         if (!ba) {
            return 0;
         }

         cur = &ba->chain;
      }

      *v = ba->value;
      return 1;
   }

   void put(HT *ht, uintptr k, uintptr v) {
      uintptr       a;
      B             *ba, **cur;

      uint16         bi, lcnt;
      uint32         mask;

      lcnt = (sizeof(uintptr) * 8) / ht->bbl;

      cur = &ht->buckets;

      mask = ~(~0 << ht->bbl);

      for (bi = 0; bi < lcnt; ++bi) {

         a = (k >> (bi * ht->bbl)) & mask;

         for (ba = *cur; ba; ba = (B*)mass_ll_next(ba)) {
            if (ba->id == a) {
               break;
            }
         }

         if (!ba) {
            ba = (B*)dmalloc(sizeof(B));
            ba->id = a;
            ba->chain = 0;
            mass_ll_add((void**)cur, ba);
         }

         cur = &ba->chain;
      }

      ba->value = v;
   }
}}

anything::hash::HT      ht;
anything::hash::init(&ht, 1);
anything::hash::put(&ht, key, value);
if (!anything::hash::get(&ht, key, &value) {
   printf("not found!\n");
}

You can reduce the memory usage to around 900kb per 15000 entries using anything::hash::init(&ht, 4), but this increases the latency.

Fastest type of hash map

1 Answers1