
I'm using a Dictionary (associative array, hash table, any of these synonyms).

The keys used to uniquely identify values are fairly long strings. However, I know that these strings tend to differ at the tail, rather than the head.

The fastest way to find a value in a JS object is to test the existence of object[key], but is that also the case for extremely long, largely similar keys (100+ chars), in a fairly large Dictionary (1000+ entries)?

Are there alternatives for this case, or is this a completely moot question, because accessing values by key is already insanely fast?
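To make the access pattern concrete (these keys are shortened placeholders; the real ones are 100+ chars with a long shared head):

  // Keys share a long head and differ only at the tail:
  var dict = {
    "project/module/submodule/record-0001": 1,
    "project/module/submodule/record-0002": 2
  };

  var key = "project/module/submodule/record-0002";
  if (dict[key] !== undefined) {  // or: key in dict
    console.log(dict[key]);
  }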

Sebastien Daniel
  • Maybe you can group the keys, like splitting them into parts of 50 characters or so. – Nina Scholz May 18 '16 at 13:20
  • A Dictionary first tries the *hash code* and only then *equality*; do you have problems with *hash codes*, e.g. too many *hash collisions*? – Dmitry Bychenko May 18 '16 at 13:20
  • @DmitryBychenko no hash collisions, the key:values are always unique. – Sebastien Daniel May 18 '16 at 13:30
  • @NinaScholz you mean breaking my keys down, to eliminate potential redundant "head" portions? – Sebastien Daniel May 18 '16 at 13:32
  • @Sebastien Daniel: the *keys* themselves are *unique*, no doubt, but what about their `hashCode`? Are they *unique*? Do you have too many equal hash codes (collisions)? E.g. keys (`a`, `b`, `c`) can be unique, but their codes can badly collide: `123`, `123`, `123`. – Dmitry Bychenko May 18 '16 at 13:33

2 Answers


Long story short: it doesn't matter much. JS will internally use a hash table (as you already said yourself), so it will need to calculate a hash of your keys for insertion and (in some cases) for accessing elements.

Calculating a hash (for most reasonable hash functions) takes slightly longer for long keys than for short ones (roughly linearly longer, I would guess), but it doesn't matter whether the differences are at the tail or at the head.
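To illustrate why: this is not V8's actual hash, just a typical string hash of the same general shape:

  // FNV-1a style string hash. Note the loop visits EVERY character,
  // which is why hashing cost grows linearly with key length,
  // regardless of where the keys differ.
  function fnv1a(str) {
    var h = 0x811c9dc5;                   // FNV offset basis
    for (var i = 0; i < str.length; i++) {
      h ^= str.charCodeAt(i);
      h = Math.imul(h, 0x01000193) >>> 0; // multiply by FNV prime, keep 32-bit
    }
    return h;
  }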

You could decide to roll your own hashes instead, cache these somehow, and use these as keys, but this would leave it up to you to deal with hash collisions. It will be very hard to do better than the default implementation, and is almost certainly not worth the trouble.
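If you did want to try it anyway, a minimal sketch would look something like this (purely hypothetical, reusing the fnv1a sketch above; note how collisions immediately become your problem):

  // Hypothetical "pre-hashed" keys. Two different long keys CAN produce
  // the same hash, so every bucket has to keep the original keys and
  // fall back to full string comparison.
  var buckets = {};

  function set(key, value) {
    var h = fnv1a(key);
    (buckets[h] = buckets[h] || []).push([key, value]);
  }

  function get(key) {
    var list = buckets[fnv1a(key)];
    if (!list) return undefined;
    for (var i = 0; i < list.length; i++) {
      if (list[i][0] === key) {           // collision handling: compare real key
        return list[i][1];
      }
    }
    return undefined;
  }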

Moreover, for an associative array with only 1000 elements, none of this is likely to matter. Modern CPUs execute billions of instructions per second. Even a linear search through the whole array would likely perform just fine, unless you have to do it very, very often.
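A quick back-of-the-envelope check, matching the sizes from the question (absolute timings will vary by engine and machine):

  // Rough micro-benchmark: ~1000 entries, keys of 100+ chars that
  // share a long head and differ only at the tail.
  var head = new Array(101).join("x");    // 100-character shared prefix
  var dict = {};
  var keys = [];
  for (var i = 0; i < 1000; i++) {
    keys.push(head + i);
    dict[head + i] = i;
  }

  var t0 = Date.now();
  var sum = 0;
  for (var n = 0; n < 1000000; n++) {
    sum += dict[keys[n % 1000]];          // one million lookups
  }
  console.log((Date.now() - t0) + " ms"); // typically milliseconds, not seconds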

TinkerTank

Hash tables (dictionary, map, etc.) first check the hash code, and only then, if necessary (in case of a collision, i.e. at least two keys having the same hash code), perform an equality check. If you experience performance problems, the first thing to check, IMHO, is hash code collisions. It may turn out (bad implementation or weird keys) that the hash code is computed from, say, only the first 3 chars (a wild exaggeration, of course):

  "abc123".hashCode() ==  
  "abc456".hashCode() ==
  ...
  "abc789".hashCode()

and so you have a lot of collisions, have to perform equality checks, and end up with a slow O(N) lookup. In that case, you have to think about a better hash function.
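JavaScript doesn't let you inspect the engine's internal hash, but you can run the same kind of check with any sample string hash over your own keys (djb2 here is just an example hash; myDict stands in for your dictionary):

  // Sketch: count how often a sample string hash collides on your keys.
  function djb2(str) {
    var h = 5381;
    for (var i = 0; i < str.length; i++) {
      h = (Math.imul(h, 33) + str.charCodeAt(i)) | 0;
    }
    return h;
  }

  function countCollisions(keys) {
    var seen = {};
    var collisions = 0;
    for (var i = 0; i < keys.length; i++) {
      var h = djb2(keys[i]);
      if (seen[h]) collisions++;  // another key already produced this hash code
      seen[h] = true;
    }
    return collisions;
  }

  // countCollisions(Object.keys(myDict)) should stay near 0 for a healthy hash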

Dmitry Bychenko