I'm working on Haxe's C# target, and I've been studying performance differences in Haxe's std library so we can provide the best performance possible through its cross-platform code.
One very good example is the hash table code. I was a little reluctant to use .NET's Dictionary: it seems bulky (structs for key/value pairs can take up a huge amount of memory because of alignment padding, on top of the extra bookkeeping fields each entry carries), and since the std library has no notion of an object hash, I really thought I could squeeze out a little performance by not having to call GetHashCode, and by inlining everything instead.
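To make the alignment-padding point concrete, here is a small C sketch (the rule is the same one the CLR's struct layout follows): a mixed-size key/value entry can end up much larger than the sum of its fields. The field names are illustrative, not .NET's actual entry layout.

```c
#include <stdint.h>

/* A tightly packed entry: two 4-byte fields, no padding needed. */
struct entry_packed {
    int32_t key;     /* 4 bytes */
    int32_t value;   /* 4 bytes -> sizeof == 8 */
};

/* A mixed entry: the 8-byte pointer forces 8-byte alignment, so the
 * compiler inserts 4 bytes of padding after `key` and 7 bytes of tail
 * padding after `flag` on common 64-bit ABIs -> sizeof == 24, of which
 * only 13 bytes are actual data. */
struct entry_padded {
    int32_t key;     /* 4 bytes, then 4 bytes of padding */
    void   *value;   /* 8 bytes */
    int8_t  flag;    /* 1 byte, then 7 bytes of tail padding */
};
```

The same effect applies to a `Dictionary` entry struct that mixes 4-byte ints with reference-sized fields, which is where part of the memory overhead comes from.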
Also, it's clear that the Dictionary implementation resolves collisions by chaining - a linked list threaded through its entries array via `next` indices - which is far from ideal.
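For reference, the chaining scheme looks roughly like this sketch - parallel arrays with integer `next` links rather than heap-allocated list nodes. This is my own minimal illustration of the idea, not .NET's actual code; it has a fixed capacity and no resizing.

```c
#include <string.h>

#define CAP 8

typedef struct {
    int buckets[CAP];          /* head entry index per bucket, -1 = empty */
    int keys[CAP], vals[CAP];
    int next[CAP];             /* next entry in the same bucket's chain */
    int count;
} IntMap;

void map_init(IntMap *m) {
    m->count = 0;
    for (int i = 0; i < CAP; i++) m->buckets[i] = -1;
}

void map_set(IntMap *m, int key, int val) {
    int b = (unsigned)key % CAP;
    /* walk the chain looking for an existing key */
    for (int i = m->buckets[b]; i != -1; i = m->next[i])
        if (m->keys[i] == key) { m->vals[i] = val; return; }
    int i = m->count++;        /* sketch only: caller must keep count < CAP */
    m->keys[i] = key;
    m->vals[i] = val;
    m->next[i] = m->buckets[b];  /* prepend to the bucket's chain */
    m->buckets[b] = i;
}

int map_get(const IntMap *m, int key, int *out) {
    int b = (unsigned)key % CAP;
    for (int i = m->buckets[b]; i != -1; i = m->next[i])
        if (m->keys[i] == key) { *out = m->vals[i]; return 1; }
    return 0;
}
```

Every collision costs a pointer-chase through `next`, which is the part that struck me as far from ideal compared with probing a flat array.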
So we started to implement our own solution, beginning with IntHash (the Dictionary equivalent). We first implemented Hopscotch hashing, but it really didn't turn out very well; in hindsight it was fairly obvious it wouldn't cope well with huge hash tables, since the neighborhood size H is normally a machine word, and the larger the table grows relative to H, the poorer the performance.
We then jumped to a khash-inspired algorithm. This one had much more potential: its benchmarks are impressive, and it handles collisions in the same flat array. It also had some great properties, like resizing without needing twice the memory we otherwise would.
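The core idea, sketched minimally below, is that keys live in one flat array, a per-slot flag records EMPTY/DELETED/USED, and collisions are resolved by probing that same array. The names and the linear-probing step are my own simplifications; khash's actual probing and its packed flags bitmap differ.

```c
#include <stdint.h>

#define CAP 16               /* power of two so (h & (CAP-1)) works */
enum { EMPTY = 0, USED = 1, DELETED = 2 };

typedef struct {
    uint8_t flags[CAP];      /* zero-initialized == all EMPTY */
    int keys[CAP], vals[CAP];
} FlatMap;

void flat_set(FlatMap *m, int key, int val) {
    unsigned i = (unsigned)key & (CAP - 1);
    /* probe until we hit the key or a reusable slot;
     * sketch only: caller must keep the load factor below 1 */
    while (m->flags[i] == USED && m->keys[i] != key)
        i = (i + 1) & (CAP - 1);
    m->flags[i] = USED;
    m->keys[i] = key;
    m->vals[i] = val;
}

int flat_get(const FlatMap *m, int key, int *out) {
    unsigned i = (unsigned)key & (CAP - 1);
    while (m->flags[i] != EMPTY) {   /* DELETED slots keep the probe going */
        if (m->flags[i] == USED && m->keys[i] == key) {
            *out = m->vals[i];
            return 1;
        }
        i = (i + 1) & (CAP - 1);
    }
    return 0;
}
```

No pointer-chasing and no per-entry allocation - which is why I expected it to beat chaining.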
The benchmarks were disappointing. Memory usage, of course, was much lower in our implementation than in Dictionary's. But I was also hoping for a nice performance boost, and unfortunately that was not the case. It wasn't too far behind - less than an order of magnitude - but for both gets and sets, .NET's implementation still performed better.
So my question is: is that the best we have for C#? I tried looking for custom solutions, and there seem to be almost none. There is the C5 generic collection library, but its code is so cluttered I didn't even test it, and I found no benchmarks for it either.
So... is that it? Should I just wrap Dictionary<>?