0

I'm looking for a data structure that can possibly outperform Dictionary<string, object>. I have a map that has N items - the map is constructed once and then read many, many times. The map doesn't change during the lifetime of the program (no new items are added, no items are deleted and items are not reordered). Because the map doesn't change, it doesn't need to be thread-safe, even though the application using it is heavily multi-threaded. I expect that ~50% of lookups will happen for items not in the map.

Dictionary<TKey, TItem> is quite fast and I may end up using it but I wonder if there's another data structure that's faster for this scenario. While the rest of the program is obviously more expensive than this map, it is used in performance-critical parts and I'd like to speed it up as much as possible.

xxbbcc
  • 16,930
  • 5
  • 50
  • 83
  • 1
    Possibly related: [Is there a read-only generic dictionary available in .NET?](http://stackoverflow.com/questions/678379/is-there-a-read-only-generic-dictionary-available-in-net) – M.Babcock Mar 09 '12 at 03:04
  • @M.Babcock: thanks but your link is not related. I'm not trying to make a map read-only - my application constructs it once and then never changes it. I'm trying to find possibly faster data structures for the scenario above. If there isn't any, I'm fine with `Dictionary` but I'm also curious if it's possible to speed it up. I also don't need a generic map - the key is `string` and the item is one of my (specific) classes. – xxbbcc Mar 09 '12 at 03:07
  • I understood that, but wasn't sure if the implementation would be more efficient. If it isn't then no big deal. That implementation is slightly thinner than the standard `Dictionary` though so I'd guess it is _slightly_ (albeit unnoticeably) faster. – M.Babcock Mar 09 '12 at 03:11
  • @M.Babcock: ah, ok. I misunderstood it. I looked at the link again but it internally still uses a `Dictionary`. – xxbbcc Mar 09 '12 at 03:26
  • I overlooked that as well, but it may also guide you towards implementing your `IDictionary`. – M.Babcock Mar 09 '12 at 03:28
  • 1
    [This](http://blog.bodurov.com/Performance-SortedList-SortedDictionary-Dictionary-Hashtable/) may help you. It compares the performance of several common alternatives. – M.Babcock Mar 09 '12 at 03:30
  • [http://en.wikipedia.org/wiki/Trie](http://en.wikipedia.org/wiki/Trie) – boardmapping Mar 09 '12 at 03:11

3 Answers3

3

What you're looking for is a Perfect Hash Function. You can create one based on your list of strings, and then use it for the Dictionary.

The non-generic HashTable has a constructor that accepts IHashCodeProvider that lets you specify your own hash function. I couldn't find an equivalent for Dictionary, so you might have to resort to using a Hashtable instead.

You can use it internally in your PerfectStringHash class, which will do all the type casting for you.

Note that you may need to be able to specify the number of buckets in the hash. I think HashTable only lets you specify the load factor. You may find out you need to roll your own hash entirely. It's a good class for everyone to use, I guess, a generic perfect hash.

EDIT: Apparantly someone already implemented some Perfect Hash algorithms in C#.

zmbq
  • 38,013
  • 14
  • 101
  • 171
  • Thank you - for this, it's a good idea that I didn't know before. I probably won't implement this at this time, because timings showed that Dictionary is remarkably fast and our deadline is coming up so it's unlikely that we'd have time to do this but I like learning new things anyway. – xxbbcc Mar 13 '12 at 03:26
0

The read performance of the generic dictionary is "close to O(1)" according to the remarks on MSDN for most TKey (and you should get pretty good performance with just string keys). And you get this out of the box, free, from the framework, without implementing your own collection.

http://msdn.microsoft.com/en-us/library/xfhwa508(v=vs.90).aspx

Val Akkapeddi
  • 1,173
  • 10
  • 17
  • Thanks, I know it's pretty fast. I'm not necessarily looking for something to replace it but I definitely want to know if there are even faster structures. – xxbbcc Mar 09 '12 at 03:18
  • 1
    Ah, well, if you do end up doing any retrieval-speed benchmarking using a .NET trie implementation vs the generic dictionary, it'd be an interesting result to note in this thread. – Val Akkapeddi Mar 09 '12 at 03:31
  • I'll do that. I measured the speed of `Dictionary` with `string` keys and it's pretty impressive. – xxbbcc Mar 09 '12 at 03:39
0

If you need to stick with string keys - Dictionary is at least very good (if not best choice).

One more thing to note when you start measuring - consider if computation of hash itself has measurable impact. Searching for long strings should take longer to compute hash. See if items you want to search for can be represented as other objects with constant get hash time.

Alexei Levenkov
  • 98,904
  • 14
  • 127
  • 179