2

I need to use Map in Javascript and my keys are binary data (Buffer in node.js). Because Map does not support custom key comparison, it looks like the only way to do this is to convert the keys to strings and use strings as keys. I would like to make my keys as short as possible to save memory. The string content of the string keys doesn't matter as long as they provide 1-to-1 mapping with the binary keys. What would be the best way to do this?

One obvious one is to use Buffer.toString('base64'), however it will use 4 characters for every 3 bytes and considering that in Javascript strings are in utf-16, meaning mostly 2 bytes per character, the memory size expansion will be 8/3 compared to size of binary keys.

Is it possible to do better?

It would be nice if we could pack 2 bytes into every character and thus use the full capacity of the string. Would using Buffer.toString('utf16le') work in all cases? I m afraid there may be some byte sequences invalid in utf-16 on which data will be substituted or exception will be thrown. Is there any other way perhaps?

Edit:

I just thought of another possibility to construct the Map keys from binary, which is to use bigint type. It seems that bigints are compared by value and can be used as Map keys. In addition there seems to be no limit on size of bigint. I can construct bigint from binary data by repeated bitwise ors and shifts. If I m not mistaken, the size of bigint constructed in such a way would be the same as size of my binary data.

I know that this is not the expected use of bigints and its seems like a hack, so is using string actually. In any case, would using bigint here be a viable option, performance-wise? (e.g. equality comparison of bigints as map keys and using bigints of fairly big size, up to a Kb or more)?

Yevgeniy P
  • 1,480
  • 1
  • 15
  • 23
  • Why don't you use [WeakMap](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/WeakMap) instead and the buffer as a key ? – Teneff May 19 '20 at 03:29
  • 1
    Doesn't WeakMap compares object keys by reference just like a Map? I need to compare keys by content (i.e. 2 keys are equal if they store the same binary data). – Yevgeniy P May 19 '20 at 04:14

1 Answers1

0

I find your idea, that is, using the primitive, compared-by-value "bigint" type for the keys, ingenious. And as you said, strings are problematic to store any arbitrary data as-is because they’re in UTF-16; you could use something like base32768 for that.


Currently only three methods are possible, as far as I know:

  • A Map with somehow-encoded-into-"bigint" keys
  • A Map with base32768-encoded "string" keys (data encoded into safe UTF-16 characters)
  • A Proxy object with get/set traps that do exhaustive (“O(n)”) comparison against all Buffer/ArrayBuffer/… data object keys

Not knowing the internal behavior of the Javascript engine, it’s hard to know which one performs better. It is something that needs a benchmark to figure out.


edit

On a second thought... something like buffer.toString("UTF16LE") could work well for map keys. Yes, it will produce invalid UTF-16 unit sequences with lone surrogates and stuff, but apparently Javascript string literals can have those intact as long as you don't apply some string manipulation methods on them. It'll do just fine for map keys, I guess.

Just don't forget to differentiate odd and even numbers of source bytes when en/decoding with some kind of padding indicator character, and be aware that the string length is limited to buffer.constants.MAX_STRING_LENGTH.