When to use hash data structure?

Question

I have a collection of objects(max 500).

My entries will be looked up frequently based on a MAC like key, whose range is unknown.

Now, I am confused as to which data structure and algorithm to use for effective look up of values.

I am not sure whether to go for a balanced BST (AVL) or a hashtable for this case.

Are 500 keys small for building hashtables?

What would be the best approach in my case?

I read that computing hash might prove costly when the number of keys is less

On a side note, I would also like to know what number of entries (min) need to be present for considering a hashtable?

Please add a comment if further details are needed.

Thanks in advance.

Unless you will build your data structures from scratch, which I discourage anyway, I suggest you to benchmark both. — Giulio Franco, Jan 07 '15 at 08:38
A hash-table gives you an O(1) lookup time; a BST gives you an O(log N) lookup time. You have to weigh the cost of computing the hash against the cost of multiple MAC comparisons. On the whole, I suspect a hash table is going to work well. — Jonathan Leffler, Jan 07 '15 at 08:40
How would you define *best*? Hashtables are usually faster, but only if you have good algorithm for hash generation, and suitable collision handling. But do you need either? You could start with `qsort`+`bsearch`, and it might be good enough. — user694733, Jan 07 '15 at 08:42
A common question with lots of answers on SO. [Here's one](http://stackoverflow.com/questions/371136/binary-trees-vs-linked-lists-vs-hash-tables). — Andy Brown, Jan 07 '15 at 08:42
Thanks everyone. @user694733 I thought of going for qsort+bsearch, but in case of dynamic insertion/deletion it may take a hit. need to insert and sort. thought AVL provides more edge there, since insertion, deletion and search are O(log N) — Nataraj, Jan 07 '15 at 09:15
@AndyBrown thanks a lot. That link sure did help. More comparisons. :) — Nataraj, Jan 07 '15 at 09:28

score 2 · Accepted Answer · answered Jan 07 '15 at 08:39

Below are some of the benefits of Hash structures,

Fast lookup (O(1) theoretically)
Efficient storage (helps to store key-value)

Though these properties are beneficial but in some scenarios hash table can underperform.

If you have large amount of objects then more storage space (Memory) will be required and thus can cause performance hit
Hashing/key algorithm should not be complex. Otherwise more time will spent on hashing and finding key.
Key collision should be minimal to avoid linear search in all values for single key or key duplication.

In your case if hashing algo is not too complex then definitely you can use hashtable as you only 500 objects. If you have lookup intensive workflow then hashtable can save lot of time. If your data is nearly static then don't worry about initial loading time because your lookup time will be much faster.

You can also look at another DS which are efficient for less values like, Hash set, AVL trees, Hash tree. For 500 objects time diff will be milli or micro second diff in linear search and hash search. thus you wont achieve much perf. improvement. Thus look for easiness and readability.

Thanks a lot. My data is static as of now, but may evolve to be dynamic. So thought it would be better considering that case also before coming up with a DS/Algo. From what you and others suggest, looks like Hash will work fine for my case. — Nataraj, Jan 07 '15 at 09:19
The number of insertions/deletions will definitely be << number of lookups — Nataraj, Jan 07 '15 at 09:23

When to use hash data structure?

1 Answers1