923

I need to convert strings to some form of hash. Is this possible in JavaScript?

I'm not utilizing a server-side language so I can't do it that way.

John
  • 1
  • 13
  • 98
  • 177
Freesnöw
  • 30,619
  • 30
  • 89
  • 138
  • 2
    You want to look up something like [Javascript md5](http://pajhome.org.uk/crypt/md5/index.html). – rlemon Sep 30 '11 at 21:55
  • 9
    MD5 is not secure, so don't look for that one. – henrikstroem Aug 20 '13 at 20:30
  • 267
    @henrikstroem Depends on what you are hashing; there's nothing wrong with using md5 to make a hash for non-security purposes. – Brad Koch Jun 26 '14 at 17:18
  • 14
    @BradKoch Depends on what you are doing; there's nothing wrong for using md5 for security purposes. There are certainly better methods for hashing passwords, but md5 is just fine for doing things like signing a URL. – Paul Ferrett Mar 16 '15 at 07:05
  • 123
    I find it funny that while MD5 is criticised in comments here, almost all answers recommend much worse hash algorithms and get lots of upvotes. – domen Jan 28 '16 at 14:35
  • 1
    An implementation of [Jenkins's one-at-a-time hash](http://en.wikipedia.org/wiki/Jenkins_hash_function) `window.hashJoaat=function(b){for(var a=0,c=b.length;c--;)a+=b.charCodeAt(c),a+=a<<10,a^=a>>6;a+=a<<3;a^=a>>11;return((a+(a<<15)&4294967295)>>>0).toString(16)};` – Orwellophile Oct 15 '16 at 16:00
  • @henrikstroem Most hash tables use far less secure algorithms. Because they need *different* properties from the algorithm. – iPherian Apr 26 '17 at 22:27
  • @iPherian Yes, that's true - good point. The reason for my comment is that the question didn't put the usage in context, and there is usually no reason for using MD5 over better algorithms. – henrikstroem Apr 27 '17 at 11:41
  • 1
    Here is a js md5 function https://github.com/blueimp/JavaScript-MD5 and here is the demo https://blueimp.github.io/JavaScript-MD5/ works great for me. – dav Feb 21 '15 at 04:48
  • 90
    Using MD5 to verify that a download came intact is not magically going to email your passwords to all your co-workers. – James M. Lay Jan 06 '18 at 01:28
  • 6
    @henrikstroem I came here looking for a hash to pick a colour – Martin Capodici Nov 20 '18 at 03:49
  • 1
    For a built-in *real* hash solution see [Kaiido's answer](https://stackoverflow.com/a/43383990/9063935) – dwb Nov 24 '20 at 10:39
  • 3
    I came here to create unique values for `key` property of ReactJS elements. Some objects don't provide an unique ID (or key) property when inside an array. ReactJS docs suggest avoid make use of index loop because of some unstability of this method. I think a hash, even non-secure one, is useful in this case. – Enrique René Dec 08 '20 at 14:54
  • @domen MD5 isn't bad because it is insecure. It is bad because it is both insecure, AND slow. There is no scenario where MD5 could not be replaced with something better suited for the purpose. – Aron Aug 11 '21 at 03:57
  • @Aron could you give some examples? You're replying to a comment that's half a decade old. – Freesnöw Aug 22 '21 at 19:39
  • Note that if you do not need to hash a string specifically but just need a hash, you can use: `window.crypto.getRandomValues(new Uint32Array(1))[0]`, or, for 8 hex character strings: `window.crypto.getRandomValues(new Uint32Array(1))[0].toString(16)` – Andrew Sep 23 '21 at 13:35
  • 1
    @JamesM.Lay But it *could* "magically" email your passwords to all your coworkers, indirectly, if MD5 is broken badly enough that someone could craft a malicious replacement for your download that still hashes to the same MD5 hash, for example by appending some malicious data followed by padding to a PDF, which your PDF reader mishandles in a way that enables a sufficiently flexible remote code execution exploit or whatever. – mtraceur Oct 01 '21 at 18:49
  • 1
    @mtraceur I can't believe I said that three years ago. I agree fully with your point, MD5 should not be used to verify downloads in an untrusted context. I think the point I'd try to drive home now is different - use MD5 for its speed, not for its security. Using MD5 to invalidate a cache or implement a hashmap of sorts, that's probably not going to result in malicious emails. – James M. Lay Oct 11 '21 at 19:15
  • @Aron You're probably right. But - MD5 is advantageous in that it is the fastest well-adopted algorithm. You can find it on basically any platform. – James M. Lay Oct 11 '21 at 19:18

26 Answers26

1041

String.prototype.hashCode = function() {
  var hash = 0,
    i, chr;
  if (this.length === 0) return hash;
  for (i = 0; i < this.length; i++) {
    chr = this.charCodeAt(i);
    hash = ((hash << 5) - hash) + chr;
    hash |= 0; // Convert to 32bit integer
  }
  return hash;
}

const str = 'revenue'
console.log(str, str.hashCode())

Source

Teocci
  • 7,189
  • 1
  • 50
  • 48
esmiralha
  • 10,587
  • 1
  • 16
  • 20
  • 40
    This is the same one used in Java. The `hash << 5 - hash` is the same as `hash * 31 + char` but a LOT faster. It's nice because it's so fast, and 31 is a small prime. Win win there. – corsiKa Sep 30 '11 at 21:59
  • 1
    Very lightweight to add on the page too. I like this. Just wondering what resistance to reversing the hashes this has? sequences of numbers for example result in predictable hashes sometimes. – Dan Sep 30 '11 at 22:07
  • 9
    Thanks, but you have leaked 2 variables into the global namespace. – Zephyr was a Friend of Mine Jul 10 '12 at 18:46
  • This is nice ... I need to generate a hash value for images url to implement some lazy-loading functionality ... This would be perfect if it did generate unsigned values only `:]` Any idea to counter this issue ? – Stphane Dec 04 '12 at 14:42
  • 3
    number based is unusable for larger strings. – Peter Aron Zentai Feb 26 '13 at 23:38
  • 1
    `hash = hash & hash` -- the link seems to be saying that `hashCode` needs to return a 32 bit integer. surely in that case a better approach would be `hash = hash & 0xffffffff`? `Math.pow(2, 33)` bitwise and itself in Chrome returns 0 for me – hdgarrood Apr 03 '13 at 16:39
  • 2
    @hdgarrood -- `Math.pow( 2, 33 ) & === 0` because the `&` op does an implicit cast to a 32-bit int and the last 32 bits of that are 0, so the 1 at bit 34 gets thrown away. Anything using the `&` op will be exactly the same, including `& 0xffffffff` –  May 31 '13 at 00:19
  • 1
    @cwolves I understand! Thank you. Am I right in thinking that `hash |= 0` should convert `hash` to a 32-bit int by keeping the least significant 32 bits? – hdgarrood Jun 01 '13 at 00:55
  • @hdgarrood Correct. An example: `hash = 7264728162427` (which is 69B738AA07B in hex). `hash | 0` will chop off everything above 32 bits (in this case, 69B) to leave 738AA07B, which is 1938464891 in decimal. So you'll find that `(hash | 0) === 1938464891`. – TachyonVortex Jul 19 '13 at 01:12
  • 18
    @PeterAronZentai Why is it "unusable"? The output produced by the number-based code `(hash * 31) + char` is identical to the output produced by the shift-based code `((hash<<5)-hash)+char`, even for very long strings (I've tested it with strings containing over a million characters), so it's not "unusable" in terms of accuracy. The complexity is O(n) for both the number-based and shift-based versions, so it's not "unusable" in terms of complexity. – TachyonVortex Jul 19 '13 at 01:53
  • 26
    Can anyone comment on the uniqueness (or not) of the output? Specifically, if I only use this hash for strings with length less than `n`, what is the largest `n` for which I can't possibly have a collision? – Don McCurdy May 23 '14 at 05:26
  • 4
    How would this be if I needed something that is unsigned? – majidarif Jul 25 '14 at 18:51
  • 60
    Is there any reason this needs (or should) be on the String prototype? Would it be any less effective/efficient to just have eg; `var hashCode = function hashCode (str) {etc...}`? And then use as `hashCode("mystring")`? – rattray Aug 04 '14 at 14:24
  • @rattray I assume because it was adapted from http://werxltd.com/wp/2010/05/13/javascript-implementation-of-javas-string-hashcode-method/. – ericsoco Sep 30 '14 at 22:24
  • 3
    @skerit your Number-based perf test isn't correctly implementing the algorithm as described here: http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#hashCode(). Here's an update: http://jsperf.com/hashing-strings/38 Number-based is significantly slower than bitwise. – ericsoco Sep 30 '14 at 22:31
  • agree with @PeterAronZentai: implemented the method due to it's great performance but for long strings you'll simply get "Infinity" – svub Oct 19 '14 at 23:04
  • I modified hashCode() to make it a little faster: http://jsperf.com/string-hashing-methods (str2hash) – thdoan Feb 06 '15 at 06:40
  • 1
    ES5 version of your answer is `[].reduce.call( str, function( hash, i ) { var chr = i.charCodeAt(0); hash = ((hash << 5) - hash) + chr; return hash | 0; }, 0 );`. – Rafael Xavier Apr 19 '15 at 02:06
  • 10
    I wonder if removing the line `if (this.length == 0) return hash;` will have a major impact on performance. In terms of clean code, in my opinion, it is just noise. – user40171 Jul 06 '15 at 08:55
  • 1
    since performance is a concern, why not O(n) but from the end to the start (making the for to reverse from the last char to the first char) - since there will be less comparisons `i – DATEx2 Oct 13 '15 at 20:43
  • 2
    FWIW I don't think the `hash |= 0` expression is doing what's expected. I have an application where I need to generate matching hashes in both JavaScript and PHP, and so I ported this function to PHP. The hashes from the two functions didn't match, but when I explicitly kept the lower 32 bits (`hash &= 0xffff`) everything matched up. – Scott Means Dec 17 '15 at 01:40
  • 1
    Check this for the same function implemented in ES6: http://stackoverflow.com/a/34842797/3439460 – Deekshith Jan 18 '16 at 22:50
  • 1
    Stupid question maybe, but how do you use this function? :D – OZZIE Apr 12 '16 at 14:10
  • 3
    @OZZIE var hashCode = yourString.hashCode(); – Israel Apr 19 '16 at 09:11
  • 4
    @ScottMeans You wouldn't expect `|= 0` always to have the same result in JS and PHP – in JS it implicitly converts the first operand to a 32-bit integer – not the case in PHP. i.e. The 32-bit integer conversion is a quirk of the JS `|` operator. – Benji XVI Aug 22 '16 at 10:54
  • 6
    create port, thank you. To ensure an always positive numbers (because I don't want negative ones) I changed `return hash;` to `return (hash >>> 0);` – pensan Oct 22 '16 at 17:13
  • @BenjiXVI In retrospect, I don't know why I would have expected it to coerce the value to 32 bits in PHP. Not sure why the author didn't just &= in the first place. – Scott Means Oct 26 '16 at 11:52
  • @pensan using `hash >>> 0` may result in an error, that the value is too large for an Int32. – RiZKiT Feb 10 '17 at 11:08
  • Can anyone tell me what the big O for this is? Please include how to pronounce it when reading it out loud (for example, "oh of en log en") Thank you so much! – Neil Girardi Sep 01 '17 at 17:25
  • 3
    The entropy between two consecutive values is very low! "a" is 97 and b is "98" – Gyum Fox Oct 11 '17 at 12:14
  • Cannot get this to work in C and Pascal, any ideas how to achieve the same results? – Codebeat Jan 31 '18 at 03:02
  • how did you come up with this formula? `hash = ((hash << 5) - hash) + chr;`. I saw this post which is using something different https://github.com/darkskyapp/string-hash/blob/master/index.js @esmiralha – abhimanyuaryan Sep 15 '18 at 22:39
  • 1
    What is the significance of bitshifting to the left 5 times, as opposed a different number? – robinnnnn Oct 29 '18 at 15:10
  • 2
    @robinnnnn Bitshifting to the left 5 times is equivalent to multiplying by 32. It was chosen because it probably resulted in the least amount of collisions during testing. You could get away even with shifting to the left just once, which is multiplying by two. There's nothing special about it. It's not much better than 4 or 6. This isn't even particularly a _good_ hash function. FNV is significantly better, and MurmurHash is even better than FNV. – bryc Dec 09 '18 at 03:18
  • 2
    Remove `if (this.length === 0) return hash;` because `0 < 0 == false` so `for` will skip and it return `0` anyway –  Oct 28 '19 at 18:58
  • @Basj why is the warning? `99162322` and `99162323` are still different, isn't it? – cYee May 27 '21 at 12:44
  • FWIW, you don't need to |0 if you've used n<<5, because the bitwise op converts to int: you only need to do it at the end. Some simple testing suggests this makes it much faster. – Whinger Jul 27 '21 at 09:34
  • Which hashing algorithm is being applied for this hash? – MiraTech Nov 26 '21 at 14:01
  • 1
    @MiraTech Java's built-in `String.hashCode`. Originally created in 1981 for Gosmacs. – bryc Dec 17 '21 at 14:51
  • Note that this might return negative numbers. Slap a `Math.abs` on if you want only positive numbers. Note that this will increase chances of collision, but as long as you aren't using it for security... – Bojan Krkic Feb 04 '22 at 13:46
  • Dont use + or ^ because it has to be - to make a complement, otherwise information is shifted out. When I changed a char the hash number did not change, so I had to think. With bitwise rotation no information is discarded. I ended up with; h = ((h << 5) | (h >>> 27)) ^ s.charCodeAt(i); The bitwise xor is faster than minus that now can be replaced. The rotation cost an extra right shift of the most significant bits to patch over the zeros that is left shifted in: http://www.java2s.com/example/nodejs/number/bitwise-rotate-a-32bit-number-to-the-left.html – Dan Froberg Apr 12 '22 at 23:38
  • Bitwise operators depend on the platform arch. So, on 64bit ones, the result is 64bits, on 32 is 32. In the given answer, `hash |= 0;` doesn't aways round it down to 32 bit int, in the majority of the cases (as of now, you get 64 bit int). If you need 32 bit int only, please change it to `hash &= 0xFFFF;` – Vlad Churakov Jul 02 '22 at 01:42
  • You can improve the performance by using a TextEncoder instead of calling charCodeAt() all the time. new TextEncoder().encode(inputString) – pubkey Jul 25 '22 at 10:51
  • 1
    Do we need this line `if (this.length === 0) return hash;` ? if length = 0 then it won't enter the loop anyway – Digital Alpha Aug 30 '22 at 13:11
  • 3
    @DonMcCurdy "what is the largest n for which I can't possibly have a collision?" I'm afraid it's `n=1`. [`'Aa'.hashCode() == 'BB'.hashCode()` is `true`](https://stackoverflow.com/q/9406775/6419007). – Eric Duminil Sep 29 '22 at 14:13
  • how do you reverse it? unhash it? – Alexander Mills Oct 30 '22 at 21:15
375

Many of the answers here are the same String.hashCode hash function taken from Java. It dates back to 1981 from Gosling Emacs, is extremely weak, and makes zero sense performance-wise in modern JavaScript. In fact, implementations could be significantly faster by using ES6 Math.imul, but no one took notice. We can do much better than this, at essentially identical performance.

Here's one I did—cyrb53, a simple but high quality 53-bit hash. It's quite fast, provides very good* hash distribution, and because it outputs 53 bits, has significantly lower collision rates compared to any 32-bit hash. Also, you can ignore SA's CC license as it's public domain on my GitHub.

const cyrb53 = (str, seed = 0) => {
    let h1 = 0xdeadbeef ^ seed, h2 = 0x41c6ce57 ^ seed;
    for(let i = 0, ch; i < str.length; i++) {
        ch = str.charCodeAt(i);
        h1 = Math.imul(h1 ^ ch, 2654435761);
        h2 = Math.imul(h2 ^ ch, 1597334677);
    }
    h1  = Math.imul(h1 ^ (h1 >>> 16), 2246822507);
    h1 ^= Math.imul(h2 ^ (h2 >>> 13), 3266489909);
    h2  = Math.imul(h2 ^ (h2 >>> 16), 2246822507);
    h2 ^= Math.imul(h1 ^ (h1 >>> 13), 3266489909);
  
    return 4294967296 * (2097151 & h2) + (h1 >>> 0);
};

console.log(`cyrb53('a') -> ${cyrb53('a')}`)
console.log(`cyrb53('b') -> ${cyrb53('b')}`)
console.log(`cyrb53('revenge') -> ${cyrb53('revenge')}`)
console.log(`cyrb53('revenue') -> ${cyrb53('revenue')}`)
console.log(`cyrb53('revenue', 1) -> ${cyrb53('revenue', 1)}`)
console.log(`cyrb53('revenue', 2) -> ${cyrb53('revenue', 2)}`)
console.log(`cyrb53('revenue', 3) -> ${cyrb53('revenue', 3)}`)

*It is roughly similar to the well-known MurmurHash/xxHash algorithms. It uses a combination of multiplication and Xorshift to generate the hash, but not as thorough. As a result it's significantly simpler to implement, but may not pass all tests in SMHasher. This is not a cryptographic hash function, so don't use this for security purposes.

Like any proper hash, it has a fairly acceptable "avalanche" effect, which basically means small changes in the input have big changes in the output, making the resulting hash appear more 'random':

"501c2ba782c97901" = cyrb53("a")
"459eda5bc254d2bf" = cyrb53("b")
"fbce64cc3b748385" = cyrb53("revenge")
"fb1d85148d13f93a" = cyrb53("revenue")

You can optionally supply a seed (unsigned integer, 32-bit max) for alternate streams of the same input:

"76fee5e6598ccd5c" = cyrb53("revenue", 1)
"1f672e2831253862" = cyrb53("revenue", 2)
"2b10de31708e6ab7" = cyrb53("revenue", 3)

Technically, it is a 64-bit hash, that is, two uncorrelated 32-bit hashes computed in parallel, but JavaScript is limited to 53-bit integers. If convenient, the full 64-bit output can be used by altering the return statement with a hex string or array.

return [h2>>>0, h1>>>0];
// or
return (h2>>>0).toString(16).padStart(8,0)+(h1>>>0).toString(16).padStart(8,0);
// or 
return 4294967296n * BigInt(h2) + BigInt(h1);

Be aware that constructing hex strings drastically slows down batch processing. The array is much more efficient, but obviously requires two checks instead of one. I also included BigInt, which should be slightly faster than String, but still much slower than Array or Number.


Just for fun, here's TinySimpleHash, the smallest hash I could come up with that's still decent. It's a 32-bit hash in 89 chars with better quality randomness than even FNV or DJB2:

TSH=s=>{for(var i=0,h=9;i<s.length;)h=Math.imul(h^s.charCodeAt(i++),9**9);return h^h>>>9}
Yves M.
  • 29,855
  • 23
  • 108
  • 144
bryc
  • 12,710
  • 6
  • 41
  • 61
  • 13
    Wow, this is so much better than the usual *31 one for short (or similar) inputs. :) – lapo Oct 03 '18 at 07:59
  • Failed for I.E 11: Object doesn't support property or method `'imul'`. – BachT Mar 18 '19 at 21:01
  • 5
    @BachT You can use a [polyfill](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Math/imul#Polyfill) or a full [ES6 shim](https://github.com/paulmillr/es6-shim). But IE11 is tragically frozen in 2009 with no updates. – bryc Mar 19 '19 at 05:48
  • 4
    @bryc How do you pick those big numbers, which aren't necessarily prime? How do they effect the efficiency and the performance of the result? – Kossi D. T. S. Oct 09 '19 at 09:35
  • 8
    @KossiD.T.S. I didn't pick those multipliers; they were borrowed elsewhere (L'Ecuyer, Knuth, MurmurHash). Usually these numbers are found by clever people via probabilistic testing (e.g. simulated annealing, genetic programming), searching for the best statistical result, tuned for their use case. They don't really affect efficiency, just quality of the hash result. For example, making them even numbers typically breaks everything, but millions of combinations of odd numbers can give decent results. I'm sure even better constants could be found. I never tested this with SMHasher. – bryc Oct 09 '19 at 22:39
  • 2
    @KossiD.T.S. Which explains why I used `9**9` as a multiplier in the TSH hash at the bottom of the answer. It's just a large odd number, 387420489. More scrambling tends to happen, the larger the multiplier is (but 32-bit is the limit of bitwise operations in JS). – bryc Oct 09 '19 at 22:43
  • @bryc Great answer! Could you please offer an open-source license for your `cyrb53` function (e.g. MIT)? – aomarks Nov 18 '20 at 19:57
  • 2
    @aomarks sorry, i don't vibe with middle age royal decrees; it's public domain, take that as you will. maybe if MIT offered me free tuition to endorse them ;) – bryc Nov 18 '20 at 21:58
  • 2
    @bryc Public domain is great! Would you be willing to explicitly grant a public-domain equivalent license for `cyrb53`? Suggested examples: https://opensource.org/licenses/0BSD https://opensource.org/licenses/unlicense The (frustrating) reason for this request is that in some jurisdictions, declaring that something is in the public domain does not legally make it so. So, the closest available tool is to cover it by a license that grants the same usage freedoms as the public domain. See https://en.wikipedia.org/wiki/Public-domain-equivalent_license. Thanks for considering! – aomarks Nov 18 '20 at 22:33
  • Hey @byrc! Could you comment on the operations that happen after the loop, and before the 53-bit int construction (`h1 = Math.imul(h1 ^ (h1>>>16)...` and the next line). I'm curious what the purpose of those operations is, and what would be the downside of omitting those lines and directly constructing the 53-bit int from h1 and h2? – aomarks Dec 09 '20 at 02:30
  • 1
    @aomarks Those two lines represent the 'final mixing function', and is required to achieve the [Avalanche effect](https://en.wikipedia.org/wiki/Avalanche_effect). If you remove it, it becomes a statistically weaker hash. The hash output will be [less uniform (less random)](https://i.imgur.com/FXbowU0.png) in some cases, which has the consequence of slightly more collisions. Red highlighted parts in image show defects occurring when final mix is removed. – bryc Dec 09 '20 at 12:16
  • Thanks @byrc! Also, what is the tool you used to do that analysis and generate that image? Looks useful. – aomarks Dec 09 '20 at 15:42
  • @aomarks it's my own tool, a simple test for collisions and distribution. it's super messy but [here](https://jsbin.com/leqifey/1/) is an older public version. also see [SMHasher](https://github.com/rurban/smhasher), but is harder to use and needs cmake etc. Haven't yet tamed that one ;) – bryc Dec 09 '20 at 18:46
  • This is awesome, @byrc! Mind some code golfing? If we count down from the string length we’ll save a variable, and if we pre-increment (_operators beforehand_) we’ll save a copy operation. Tiny theoretical speed improvements _and_ 2 characters shorter . `TSH=s=>{for(var i=s.length,h=9;i;)h=Math.imul(h^s.charCodeAt(--i),9**9);return h^h>>>9}` – Jon Neal Dec 12 '20 at 04:01
  • @bryc can't the 9**9, in the last version, be precalculated? seems static – Odys Jan 15 '21 at 08:01
  • 1
    @Odys it can, but the point for that one is to be as short as possible, `9**9` is 4 chars and is the largest odd number achievable with ** notation. `387420489` is 9 chars. I believe modern JS engines optimize that away. At that point, you might as well use a statistically better constant tuned for your use case. Cheers. – bryc Jan 15 '21 at 14:34
  • `str` must be a string, `seed` must be a number, and the output returned from the function is a number. But what are the specs in regards to the `str` and `seed` parameters in terms of how short/long or low/high they can be? And can seed be a floating point? Etc. – Andrew Sep 23 '21 at 12:41
  • 3
    @Andrew Updated answer to clarify on `seed`. It's intended to be a 32-bit unsigned integer. So the values can be 0 to 2^32-1, and with no decimal points. So no floats; JS will just strip fractional part upon the first XOR operation anyway. There's no limit on the length of `str`. Also, it can easily be used with Array instead of String, but this question is for strings. – bryc Sep 24 '21 at 09:52
  • Not sure why this happens, but if I use a seed of 12, the function sometimes returns a 13 digit hex (not including '0x') at the front , and sometimes a 14. I am converting "number" to Hex string using ethers.js. Maybe it has something to do with length of the string to be hashed? – GGizmos Jan 09 '22 at 22:28
  • @GGizmos Not sure what you mean. What's the *exact* `string`/`seed` values that causes `cyrb53` to return 0x13/0x14? This might instead be an ethers.js issue. See if `Math.random()*2**53` also causes 0x13/0x14. – bryc Jan 10 '22 at 01:59
  • @bryc: using a seed of 12, and then outputting the string resulting from `ethers.utils.hexValue(result)`, I get the value `0x1e5ec69e41378b` (16 chars) from an input value of "a", and the value `0x978a7061b8325` when the input value is "ab" – GGizmos Jan 10 '22 at 02:19
  • @bryc: actually, a seed of 0 produces a similar result -- with different length of hash of "a" versus "ab". Anyway to modify to always produce a result of the same length? – GGizmos Jan 12 '22 at 18:02
  • @GGizmos Try the alt `toString(16)` return line in the answer which produces a fixed 64-bit hex string; perhaps you can bypass the ethers utility function which I still think is the problem here. I would still need a repro case (EXACT input arguments to reproduce the return of 0x13/0x14) to verify if its an issue with cyrb53. In general, it is extremely unlikely for cyrb53 to return such a small number. – bryc Jan 13 '22 at 12:51
  • StackOverflow indicates this code is licensed CC BY-SA 4.0, which limits its use to stuff licensed under a similar license (sharealike). Would you be willing to relicense it under a more permissive license, like the MIT license? – Suchipi May 03 '22 at 20:22
  • 1
    @bryc The numbers for these kinds of functions aren't choosen by guessing and checking via things like simulated annealing. Rather, they are based on a mathematical analysis of the underlying group/ring structure. They look really weird, but it's usually conditions like: you want prime or relatively prime values, or values that are near/far powers of small primes or whose differences...you get the idea. Basically, you do some group theory and the way these numbers relate defines your subgroup structure. – Peter Gerdes May 23 '22 at 11:10
  • 1
    @PeterGerdes Would you have some sources/reference material on how these 'constants' are chosen? MurmurHash author found his constants through a simulated annealing algorithm, which is why I mentioned it; unfortunately few authors explain the source of their constants. I personally haven't found prime multipliers to outperform quality-wise any random large odd number. But I'd be interested to know if the algorithm can be improved with different properly chosen constants. – bryc May 24 '22 at 00:54
  • Ohh I meant the primes as an example (u usually have to look at the underlying group thy) ...and sorry I meant to say "often are choosen" but I took too long editing my comment. Let me find you an example. Here's the note that 31 in java's hashcode is picked based on being a prime https://www.sitepoint.com/how-to-implement-javas-hashcode-correctly/ and here's a remark about the constant in another hash function being determined by a mathematicial formula https://stackoverflow.com/questions/24922452/what-precisely-is-the-algorithm-used-by-java-lang-objects-hashcode. – Peter Gerdes May 24 '22 at 22:40
  • Ohh here's a better explanation of where the number 2654435761 comes from in the Knuth multiplicative hash comes from (and the computation of it comes from picking a prime but the number isn't literally equal to the prime) https://stackoverflow.com/questions/11871245/knuth-multiplicative-hash – Peter Gerdes May 24 '22 at 22:43
  • @PeterGerdes Thanks. But all that really tells me is that primes are supposed to be magic. What's missing is the data to back up such choices. You mention Graph Theory but I'm not about to study 4yr Uni math just to see its application to this niche. But I somehow doubt a 32-bit fractional approximation of Phi is magically better than millions of other large odd multipliers. Anyway, I keep a [table of multipliers](https://rentry.co/f32xs) I've found around, as comparing/analyzing them could be interesting. – bryc May 25 '22 at 14:28
  • 1
    amazing, thanks. also amazing that no one commented on the brilliance of 0xdeadbeef yet, nice one. – gotjosh Jul 26 '22 at 19:13
  • Just wondering, but the javascript max. safe integer is 9007199254740991. It seems that this function could generate a number that is higher than this?? – svenema Feb 02 '23 at 16:03
  • 1
    @svenema It's designed with the max safe integer limit in mind. The default return statement will be a integer between 0 and 9007199254740991, inclusive. A 53-bit unsigned integer. To accomplish this, it combines two 32-bit numbers `h1` and `h2`, but only uses 21 bits from `h2`. The multiplication by 4294967296 is really just a left-shift for alignment purposes. The AND mask of 2097151 ensures that `h2` is a 21-bit number. It all works out. – bryc Feb 02 '23 at 18:56
196

EDIT

based on my jsperf tests, the accepted answer is actually faster: http://jsperf.com/hashcodelordvlad

ORIGINAL

if anyone is interested, here is an improved ( faster ) version, which will fail on older browsers who lack the reduce array function.

hashCode = function(s) {
  return s.split("").reduce(function(a, b) {
    a = ((a << 5) - a) + b.charCodeAt(0);
    return a & a;
  }, 0);
}
 
 // testing
 console.log(hashCode("hello."));
 console.log(hashCode("this is a text."));
 console.log(hashCode("Despacito by Luis Fonsi"));

one-liner arrow function version :

hashCode = s => s.split('').reduce((a,b)=>{a=((a<<5)-a)+b.charCodeAt(0);return a&a},0)

 // testing
 console.log(hashCode("hello."));
 console.log(hashCode("this is a text."));
 console.log(hashCode("Despacito by Luis Fonsi"));
Bibek Oli
  • 366
  • 2
  • 14
lordvlad
  • 5,200
  • 1
  • 24
  • 44
  • 7
    is there a way to get hash wich is positive number only? – Prosto Trader Nov 08 '13 at 15:26
  • @lordvlad May I use this in an open source project? I think I copied it earlier, as it uses return a|=0 rather than return a&a. – Dave Cameron Nov 22 '13 at 05:10
  • Yes, please! Always a pleasure to be of use :) just let me know what it is – lordvlad Nov 22 '13 at 08:39
  • @lordvlad I've used it in some modifications I am making to the acra-storage project. It is a backend for android crash reporting. – Dave Cameron Nov 25 '13 at 01:19
  • Faster comparing to...? – BrunoLM Dec 17 '13 at 22:33
  • 1
    @ProstoTrader Add >>>0 at the end. This function is 10 times faster than the accepted answer on both Firefox and Chrome for me. – SGr Dec 18 '13 at 15:46
  • 73
    weird. i just tested it and it turned out to be waaay slower than the accepted answer. http://jsperf.com/hashcodelordvlad – lordvlad Dec 18 '13 at 16:34
  • 194
    Good guy @lordvlad, actually testing his own answer, and then reporting when it was slower. – mikemaccana Feb 24 '14 at 13:32
  • Here's a performance comparison of three common string-to-integer hash functions: http://jsperf.com/string-hashing-methods – thdoan Feb 06 '15 at 06:19
  • 14
    I just realized: It makes perfect sense that the accepted answer is faster, because my version has to turn the string into an array first, allocating new memory and copying every character... – lordvlad Feb 22 '15 at 10:04
  • 8
    [].reduce.call(str, (p, c, i, a) => (p << 5) - p + a.charCodeAt(i), 0); – Dizzy Mar 24 '16 at 12:46
  • 1
    @SGr using `hash >>> 0` may result in an error, that the value is too large for an Int32. – RiZKiT Feb 10 '17 at 11:08
  • @lordvlad where do you get this formula from? (a<<5)-a)+b.charCodeAt(0) – abhimanyuaryan Sep 15 '18 at 22:41
  • @AbhimanyuAryan same as the accepted answer, from https://werxltd.com/wp/2010/05/13/javascript-implementation-of-javas-string-hashcode-method/ – lordvlad Sep 16 '18 at 17:08
  • Is this a one-way hash? Or it could be reversed? – turtlepower Jan 30 '19 at 22:06
  • One way. https://stackoverflow.com/questions/9406775/why-does-string-hashcode-in-java-have-many-conflicts – lordvlad Jan 31 '19 at 07:54
  • The setup and re-setup of the implicit variables and anonymous function is what kills the performance. Played with the JS Perf a minute to see if `reduce` could be viable, and it does not come close. So, I crunched `verbose` to a three line version which is on par with verbose performance. [jsperf.com](https://jsperf.com/hashcodelordvlad/83) FWIW @AbhimanyuAryan This is a 32 bit hash based on a math series s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] (source [java.net](http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/lang/String.java#l1443)) – Bob Smith May 24 '19 at 22:26
  • I upvoted this answer for the one-liner even if it's a little slower than the accepted answer... thanks! – huug Jun 09 '20 at 07:27
  • The link in the question is broken – TechWisdom Nov 14 '20 at 13:07
  • I have no idea why "modern" jsdevs think reduce, map and friends are faster. Do you not realize that these method each create a copy of the arrray? – Tomáš Zato Nov 27 '20 at 13:23
  • How precise is this? I'm using it to difference audio files from eachother – FooBar Dec 13 '20 at 21:15
  • Just tested: Accepted esmiralha answer x10.000 runs: ~380ms. lordVlad's one-liner x10.000 runs: ~160ms. Dizzy's one-liner x10.000 runs: ~1300ms. – cape_bsas Jun 04 '21 at 18:01
  • Here's the same code, but even [slightly] more compressed (and as a regular function): `function hash(s){return s.split("").reduce((a,b)=>(a=(a<<5)-a+b.charCodeAt(0))&a,0)}` – ashleedawg Nov 01 '21 at 09:56
139

Note: Even with the best 32-bit hash, collisions will occur sooner or later.

The hash collision probability can be calculated as 1 - e ^ (-k(k-1) / 2N, approximated as k^2 / 2N (see here). This may be higher than intuition suggests:
Assuming a 32-bit hash and k=10,000 items, a collision will occur with a probability of 1.2%. For 77,163 samples the probability becomes 50%! (calculator).
I suggest a workaround at the bottom.

In an answer to this question Which hashing algorithm is best for uniqueness and speed?, Ian Boyd posted a good in depth analysis. In short (as I interpret it), he comes to the conclusion that MurmurHash is best, followed by FNV-1a.
Java’s String.hashCode() algorithm that esmiralha proposed seems to be a variant of DJB2.

  • FNV-1a has a a better distribution than DJB2, but is slower
  • DJB2 is faster than FNV-1a, but tends to yield more collisions
  • MurmurHash3 is better and faster than DJB2 and FNV-1a (but the optimized implementation requires more lines of code than FNV and DJB2)

Some benchmarks with large input strings here: http://jsperf.com/32-bit-hash
When short input strings are hashed, murmur's performance drops, relative to DJ2B and FNV-1a: http://jsperf.com/32-bit-hash/3

So in general I would recommend murmur3.
See here for a JavaScript implementation: https://github.com/garycourt/murmurhash-js

If input strings are short and performance is more important than distribution quality, use DJB2 (as proposed by the accepted answer by esmiralha).

If quality and small code size are more important than speed, I use this implementation of FNV-1a (based on this code).

/**
 * Calculate a 32 bit FNV-1a hash
 * Found here: https://gist.github.com/vaiorabbit/5657561
 * Ref.: http://isthe.com/chongo/tech/comp/fnv/
 *
 * @param {string} str the input value
 * @param {boolean} [asString=false] set to true to return the hash value as 
 *     8-digit hex string instead of an integer
 * @param {integer} [seed] optionally pass the hash of the previous chunk
 * @returns {integer | string}
 */
function hashFnv32a(str, asString, seed) {
    /*jshint bitwise:false */
    var i, l,
        hval = (seed === undefined) ? 0x811c9dc5 : seed;

    for (i = 0, l = str.length; i < l; i++) {
        hval ^= str.charCodeAt(i);
        hval += (hval << 1) + (hval << 4) + (hval << 7) + (hval << 8) + (hval << 24);
    }
    if( asString ){
        // Convert to 8 digit hex string
        return ("0000000" + (hval >>> 0).toString(16)).substr(-8);
    }
    return hval >>> 0;
}

Improve Collision Probability

As explained here, we can extend the hash bit size using this trick:

function hash64(str) {
    var h1 = hash32(str);  // returns 32 bit (as 8 byte hex string)
    return h1 + hash32(h1 + str);  // 64 bit (as 16 byte hex string)
}

Use it with care and don't expect too much though.

Yves M.
  • 29,855
  • 23
  • 108
  • 144
mar10
  • 14,320
  • 5
  • 39
  • 64
  • Why do you do `("0000000" + (hval >>> 0).toString(16)).substr(-8);`? Isn't that the same as `(hval >>> 0).toString(16)`? – Manuel Meurer Oct 22 '14 at 14:14
  • 4
    this add leading '0's so that the resulting hash is always 8 chars long. Easier to read and recognize in outputs, but that's my personal opinion – mar10 Oct 22 '14 at 14:53
  • Ah ok, I get it. For small `hval`, `(hval >>> 0).toString(16)` might be less than 8 characters, so you pad it with zeros. I was just confused because `(hval >>> 0).toString(16)` always resulted in a exactly 8 character string for me. – Manuel Meurer Oct 22 '14 at 15:02
  • Sweet! I was struggling to find out how to XOR (^) in JS because it was returning negative numbers. This thing you did with hval did the trick. Now I have an algorithm that returns the same hash in C# as in JS. I will post them both below. Thanks! – djabraham May 04 '15 at 18:25
  • 3
    I love this answer because it produces a sooo much better distributed hash: other functions proposed here will make consequent hash values. Eg `hash("example1") - hash("example2") == 1", while this one is much more unpredictable. – GavinoGrifoni Oct 07 '16 at 06:26
  • At which length it is recommended to use murmur3? I'm planning to decide which hash to use depending on the string length, so it would be fast for short strings and secure for large strings, but I'm not sure if there would be collisions between the two algorithms, is this a good idea? – David Feb 14 '17 at 09:44
  • 1
    In response to "FNV-1a has a a better distribution than DJB2, but is slower" - I think it should be said that FNV1a can be extremely fast when implemented using the ES6 `Math.imul` function. That alone makes it top benchmarks, and ultimately a better choice than DJB2 in the long run. – bryc Jan 10 '19 at 00:53
  • Came here just to say that it is indeed way faster using `Math.imul`: [https://jsperf.com/fowler-noll-vo-hash-function-implementations](https://jsperf.com/fowler-noll-vo-hash-function-implementations) – stefanmaric Mar 24 '20 at 23:23
  • Perfecto. Thanks for sharing ! – Shiva Avula Jun 05 '20 at 15:07
  • @stefanmaric jsperf link is broken, mind sharing a different link? – bruh Mar 30 '21 at 21:21
  • @bruh so JSPerf is in general out of service. I don't have the specific test suite I used at JSPerf but created a new one here that should be similar: https://jsbench.me/vbknu74f3e – stefanmaric Apr 23 '21 at 11:14
  • Here's the benchmark I didn't show for my previous comment: https://jsbench.me/8mjqpwyesk :) – bryc Apr 24 '21 at 19:11
100

Based on accepted answer in ES6. Smaller, maintainable and works in modern browsers.

function hashCode(str) {
  return str.split('').reduce((prevHash, currVal) =>
    (((prevHash << 5) - prevHash) + currVal.charCodeAt(0))|0, 0);
}

// Test
console.log("hashCode(\"Hello!\"): ", hashCode('Hello!'));

EDIT (2019-11-04):

one-liner arrow function version :

const hashCode = s => s.split('').reduce((a,b) => (((a << 5) - a) + b.charCodeAt(0))|0, 0)

// test
console.log(hashCode('Hello!'))
vdegenne
  • 12,272
  • 14
  • 80
  • 106
Deekshith
  • 1,544
  • 2
  • 14
  • 15
  • 1
    Thanks for sharing I added `str += ""` before hashing to avoid exception `str.split is not a function` thrown when non-strings were passed as parameters – BeetleJuice Jul 11 '16 at 10:17
  • 8
    But much, much slower than any of these: [https://jsperf.com/hashing-strings](https://jsperf.com/hashing-strings) – AndyO Apr 27 '17 at 09:34
  • 1
    I also just noticed that the fastest "retro" solution is actually smaller as well if you remove the line feeds so that it's only 3 lines long. – AndyO Apr 27 '17 at 09:47
  • @AndyO I dont see this code included in the jsperf link. – Deekshith Apr 28 '17 at 15:44
  • Dude, really? a) It takes only a few minutes to setup your own benchmark with Benchmark.js (which is what I did) b) Do you think I can't see the test case you created here? Which is "94% slower"? https://jsperf.com/hashing-strings/56 – AndyO May 02 '17 at 08:18
  • Pardon me, I got you wrong. I thought you posted the link which has this benchmark and I just pointed out that I was not able to see that. Thanks for reminding me of the test I created an year ago. Yes this was not the fastest back then but I was going with an assumption that ES6 was new and browser implementations of these new features is not optimized. But here we go even after an year, there is still no significant improvement in the performance. This debunks my assumption (?). I am now guessing that the functional primitives like `reduce` are inherently slower than a simple `for` loop. – Deekshith May 02 '17 at 14:44
  • 4
    Any way to have this produce only positive but still unique results? – Dids Nov 06 '17 at 13:01
  • 2
    @BeetleJuice The more appropriate question is if you have a function that is designed to take a string then why is your program sending it a non-string in the first place? Perhaps the error is a sign that the caller is doing strange things. Food for thought. – Sukima Mar 29 '18 at 13:25
  • 4
    @deekshith The accepted answer uses `hash |= 0` to convert to an 32 bit int. This implementation does not. Is this a bug? – Sukima Mar 29 '18 at 14:35
  • 1
    @Sukima Thanks for pointing out. No idea how I missed it. Fixed it just now :) – Deekshith Apr 01 '18 at 04:35
  • I'm getting a negative number as a result from this when using JSON.stringify on an object. How can I adjust this to make sure that I always get a positive number for the hash? – CWSites May 29 '19 at 19:59
  • 1
    Warning! this is a bad hashing method, it does not really "hash": `hashCode("hello")` (99162322) has the same first digits than `hashCode("hellp")` (99162323); this is typically what we *don't* want from a hash function... – Basj Jun 09 '20 at 15:07
  • One way to obtain non-negative but unique results would be to add 2^31. `+ Math.pow(2, 31)` – Chris Feb 05 '21 at 13:34
  • Do you actually believe that this version is more maintainable? – tleb Feb 10 '21 at 15:37
59

I'm a bit surprised nobody has talked about the new SubtleCrypto API yet.

To get an hash from a string, you can use the subtle.digest method :

function getHash(str, algo = "SHA-256") {
  let strBuf = new TextEncoder().encode(str);
  return crypto.subtle.digest(algo, strBuf)
    .then(hash => {
      window.hash = hash;
      // here hash is an arrayBuffer, 
      // so we'll connvert it to its hex version
      let result = '';
      const view = new DataView(hash);
      for (let i = 0; i < hash.byteLength; i += 4) {
        result += ('00000000' + view.getUint32(i).toString(16)).slice(-8);
      }
      return result;
    });
}

getHash('hello world')
  .then(hash => {
    console.log(hash);
  });
Klesun
  • 12,280
  • 5
  • 59
  • 52
Kaiido
  • 123,334
  • 13
  • 219
  • 285
  • 5
    I agree. The conversion to hex could be done a little different... `var promise = crypto.subtle.digest({name: "SHA-256"}, Uint8Array.from(data)); promise.then(function(result){ console.log(Array.prototype.map.call(new Uint8Array(result), x => x.toString(16).padStart(2, '0')).join('')); });` – Denis Giffeler Oct 03 '17 at 15:01
  • 7
    A cryptographic hash function for strings is a bit overkill.. `crypto` isn't exactly performant. – bryc Oct 03 '18 at 22:43
  • 1
    Reliable quality random without having to rely on people running tests, built-in (no custom implementation needed), seedable, and I only needed a few hundred numbers for generating a game map, this seemed perfect. But it turns out there is absolutely no way to do it synchronously. Having to provide some async callback thingy every time you call your seeded random engine makes the code super unreadable and look ridiculous. I don't get who came up with this crappy crypto.subtle interface, so I in the end I had to go with xmur3+sfc32 from this answer: https://stackoverflow.com/a/47593316/1201863 – Luc May 01 '20 at 17:49
  • I'm confused about the line `result += ('00000000' + view.getUint32(i).toString(16)).slice(-8);` What is the point of adding 8 zeroes `00000000` and then slicing them off at the end? `.slice(-8)`? When I remove those parts I get the exact same result (in my very limited amount of testing). What am I missing? – I0_ol May 07 '22 at 08:07
  • @I0_ol this is the equivalent of `x.toString(16).padStart(2, '0')` from Denis's comment. `.slice(-8)` will remove the ones before that aren't used. – Kaiido May 07 '22 at 10:33
  • @Kaiido What I'm saying is that I get the same result without adding 0's or slicing them off. Basically this `result += ('00000000' + view.getUint32(i).toString(16)).slice(-8);` gives the same result as `result += view.getUint32(i).toString(16);` – I0_ol May 07 '22 at 11:58
  • 1
    @I0_ol because you always got a value higher than 0xFF0000 Simply try with `view = new DataView(new Uint32Array([0]).buffer)`. You'll get `"0"` without the padding and `"00000000"` with. – Kaiido May 07 '22 at 13:38
44

This is a refined and better performing variant, and matches Java's implementation of the standard object.hashCode() for CharSequence.

String.prototype.hashCode = function() {
    var hash = 0, i = 0, len = this.length;
    while ( i < len ) {
        hash  = ((hash << 5) - hash + this.charCodeAt(i++)) << 0;
    }
    return hash;
};

Here is also one that returns only positive hashcodes:

String.prototype.hashcode = function() {
    return this.hashCode()+ 2147483647 + 1;
};

And here is a matching one for Java that only returns positive hashcodes:

public static long hashcode(Object obj) {
    return ((long) obj.hashCode()) + Integer.MAX_VALUE + 1l;
}

Without prototype for those that do not want to attach it to String :

function hashCode(str) {
    var hash = 0, i = 0, len = str.length;
    while ( i < len ) {
        hash  = ((hash << 5) - hash + str.charCodeAt(i++)) << 0;
    }
    return hash;
}

function hashcode(str) {
    hashCode(str) + 2147483647 + 1;
}

Enjoy!

mjs
  • 21,431
  • 31
  • 118
  • 200
30

If it helps anyone, I combined the top two answers into an older-browser-tolerant version, which uses the fast version if reduce is available and falls back to esmiralha's solution if it's not.

/**
 * @see http://stackoverflow.com/q/7616461/940217
 * @return {number}
 */
String.prototype.hashCode = function(){
    if (Array.prototype.reduce){
        return this.split("").reduce(function(a,b){a=((a<<5)-a)+b.charCodeAt(0);return a&a},0);              
    } 
    var hash = 0;
    if (this.length === 0) return hash;
    for (var i = 0; i < this.length; i++) {
        var character  = this.charCodeAt(i);
        hash  = ((hash<<5)-hash)+character;
        hash = hash & hash; // Convert to 32bit integer
    }
    return hash;
}

Usage is like:

var hash = "some string to be hashed".hashCode();
FZs
  • 16,581
  • 13
  • 41
  • 50
Kyle Falconer
  • 8,302
  • 6
  • 48
  • 68
  • how to optimize this code to run faster in every browser. `String.prototype.hashCode = function(){ var hash = 5381; if (this.length === 0) return hash; for (var i = 0; i < this.length; i++) { var character = this.charCodeAt(i); hash = ((hash<<5)+hash)^character; // Convert to 32bit integer } return hash; }` – Musakkhir Sayyed Jun 24 '15 at 09:54
  • What is the purpose of this part: `return a & a`? Shouldn't that just return a regardless? – James Stewart Feb 25 '22 at 10:01
  • Not sure what you by regardless mean and I'm not expert on this stuff but this is a bitwise operator and is doing some math magic hope that helps somehow – Can Rau Jul 09 '22 at 02:53
20

UUID v3 and UUID v5 actually are hashes for a given input string.

  • UUID v3 is based on MD5,
  • UUID v5 is based on SHA-1.

So, the most obvious choice would be to go for UUID v5.

Fortunately, there is a popular npm package, which includes all UUID algorithms.

npm install uuid

To actually generate a UUID v5, you need a unique namespace. This namespace acts like a seed, and should be a constant, to assure that for a given input the output will always be the same. Ironically, you should generate a UUID v4 as a namespace. And the easiest way to do so, is using some online tool.

Once you've got a namespace, you're all set.

import { v5 as uuidv5 } from 'uuid';

const MY_NAMESPACE = '1b671a64-40d5-491e-99b0-da01ff1f3341';
const hash = uuidv5('input', MY_NAMESPACE);

If your input string will always be an URL for instance, then there are some default namespaces which you can use.

const hashForURL = uuidv5('https://www.w3.org/', uuidv5.URL);
bvdb
  • 22,839
  • 10
  • 110
  • 123
  • 1
    Prefer this answer because it calls a library. Wonder if there's a way to make the generated ID shorter, say 10 chars? I looked up NanoID but it seems to lack the option to take an extra argument. – Yan King Yin Dec 23 '22 at 02:40
  • 1
    @YanKingYin part of the reason why it's so long, is to guarantee uniqueness. Which is an advantage of this answer. ;;;; However, uniqueness isn't always a requirement for a hash-code. If you don't need uniqueness, there's plenty of other algorithms that can give you shorter hashes. _(e.g. indexing algorithms sometimes use hashes that aren't unique. A query can then result in false-positive hits, which will be filtered by a strict compare afterwards.) – bvdb Dec 23 '22 at 16:10
14

I'm kinda late to the party, but you can use this module: crypto:

const crypto = require('crypto');

const SALT = '$ome$alt';

function generateHash(pass) {
  return crypto.createHmac('sha256', SALT)
    .update(pass)
    .digest('hex');
}

The result of this function is always is 64 characters string; something like this: "aa54e7563b1964037849528e7ba068eb7767b1fab74a8d80fe300828b996714a"

Ariel Jiménez
  • 199
  • 2
  • 6
13

Here is a compact ES6 friendly readable snippet

const stringHashCode = str => {
  let hash = 0
  for (let i = 0; i < str.length; ++i)
    hash = Math.imul(31, hash) + str.charCodeAt(i)

  return hash | 0
}
amirhe
  • 2,186
  • 1
  • 13
  • 27
8

My quick (very long) one liner based on FNV's Multiply+Xor method:

my_string.split('').map(v=>v.charCodeAt(0)).reduce((a,v)=>a+((a<<7)+(a<<3))^v).toString(16);
John Smith
  • 570
  • 2
  • 7
  • 17
  • That's awesome, exactly what I was looking for, to generate a small hash based on a larger string. I needed to scroll so much for this solution, but was worth it!!! – Tiberiu Popescu Jun 23 '20 at 06:46
7

Thanks to the example by mar10, I found a way to get the same results in C# AND Javascript for an FNV-1a. If unicode chars are present, the upper portion is discarded for the sake of performance. Don't know why it would be helpful to maintain those when hashing, as am only hashing url paths for now.

C# Version

private static readonly UInt32 FNV_OFFSET_32 = 0x811c9dc5;   // 2166136261
private static readonly UInt32 FNV_PRIME_32 = 0x1000193;     // 16777619

// Unsigned 32bit integer FNV-1a
public static UInt32 HashFnv32u(this string s)
{
    // byte[] arr = Encoding.UTF8.GetBytes(s);      // 8 bit expanded unicode array
    char[] arr = s.ToCharArray();                   // 16 bit unicode is native .net 

    UInt32 hash = FNV_OFFSET_32;
    for (var i = 0; i < s.Length; i++)
    {
        // Strips unicode bits, only the lower 8 bits of the values are used
        hash = hash ^ unchecked((byte)(arr[i] & 0xFF));
        hash = hash * FNV_PRIME_32;
    }
    return hash;
}

// Signed hash for storing in SQL Server
public static Int32 HashFnv32s(this string s)
{
    return unchecked((int)s.HashFnv32u());
}

JavaScript Version

var utils = utils || {};

utils.FNV_OFFSET_32 = 0x811c9dc5;

utils.hashFnv32a = function (input) {
    var hval = utils.FNV_OFFSET_32;

    // Strips unicode bits, only the lower 8 bits of the values are used
    for (var i = 0; i < input.length; i++) {
        hval = hval ^ (input.charCodeAt(i) & 0xFF);
        hval += (hval << 1) + (hval << 4) + (hval << 7) + (hval << 8) + (hval << 24);
    }

    return hval >>> 0;
}

utils.toHex = function (val) {
    return ("0000000" + (val >>> 0).toString(16)).substr(-8);
}
djabraham
  • 766
  • 11
  • 20
  • @mathiasrw It's possible for Unicode characters to exceed 8 bits in memory, so I assume the 0xFF simply masks off anything outside that range. See more about charCodeAt() here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/charCodeAt – djabraham Jul 25 '17 at 17:41
  • If ES6 is available (all modern engines support it), `Math.imul` can be used for the multiplication step, which [greatly improves performance](https://jsbench.me/8mjqpwyesk/1). Only issue is, it won't work in IE11 without a [shim](https://github.com/paulmillr/es6-shim/blob/master/es6-shim.min.js). – bryc Jan 10 '19 at 01:18
6

A fast and concise one which was adapted from here:

String.prototype.hashCode = function() {
  var hash = 5381, i = this.length
  while(i)
    hash = (hash * 33) ^ this.charCodeAt(--i)
  return hash >>> 0;
}
wybe
  • 615
  • 5
  • 14
soulmachine
  • 3,917
  • 4
  • 46
  • 56
6

SubtleCrypto.digest

I’m not utilizing a server-side language so I can’t do it that way.

Are you sure you can’t do it that way?

Did you forget you’re using Javascript, the language ever-evolving?

Try SubtleCrypto. It supports SHA-1, SHA-128, SHA-256, and SHA-512 hash functions.


async function hash(message/*: string */) {
 const text_encoder = new TextEncoder;
 const data = text_encoder.encode(message);
 const message_digest = await window.crypto.subtle.digest("SHA-512", data);
 return message_digest;
} // -> ArrayBuffer

function in_hex(data/*: ArrayBuffer */) {
 const octets = new Uint8Array(data);
 const hex = [].map.call(octets, octet => octet.toString(16).padStart(2, "0")).join("");
 return hex;
} // -> string

(async function demo() {
 console.log(in_hex(await hash("Thanks for the magic.")));
})();
5

I needed a similar function (but different) to generate a unique-ish ID based on the username and current time. So:

window.newId = ->
  # create a number based on the username
  unless window.userNumber?
    window.userNumber = 0
  for c,i in window.MyNamespace.userName
    char = window.MyNamespace.userName.charCodeAt(i)
    window.MyNamespace.userNumber+=char
  ((window.MyNamespace.userNumber + Math.floor(Math.random() * 1e15) + new Date().getMilliseconds()).toString(36)).toUpperCase()

Produces:

2DVFXJGEKL
6IZPAKFQFL
ORGOENVMG
... etc 

edit Jul 2022: As @canRau points out the authors of shortid prefer nanoid now https://github.com/ai/nanoid/

jcollum
  • 43,623
  • 55
  • 191
  • 321
  • 2
    @t0r0X well now I use a module called shortid: https://www.npmjs.com/package/shortid – jcollum Jun 23 '15 at 17:31
  • 1
    How are you using the username with shortid? It just seems to generate ids but I don't see how you are using to generate a hash from a string – cyberwombat Sep 12 '16 at 22:14
  • 1
    This answer has 3 downvotes. For the life of me I can't imagine why. No one has said anything... :-/ – jcollum Feb 16 '17 at 17:18
  • 1
    @jcollum that's why i almost never answer stale questions.. hit and runs go unnoticed. even after you fix up the answer, no one comes along to balance it out. – bryc Dec 04 '19 at 01:15
  • authors of shortid now recommend their improved version https://github.com/ai/nanoid/ – Can Rau Jul 09 '22 at 03:06
5

The Jenkins One at a Time Hash is quite nice:

//Credits (modified code): Bob Jenkins (http://www.burtleburtle.net/bob/hash/doobs.html)
//See also: https://en.wikipedia.org/wiki/Jenkins_hash_function
//Takes a string of any size and returns an avalanching hash string of 8 hex characters.
function jenkinsOneAtATimeHash(keyString)
{
  let hash = 0;
  for (charIndex = 0; charIndex < keyString.length; ++charIndex)
  {
    hash += keyString.charCodeAt(charIndex);
    hash += hash << 10;
    hash ^= hash >> 6;
  }
  hash += hash << 3;
  hash ^= hash >> 11;
  //4,294,967,295 is FFFFFFFF, the maximum 32 bit unsigned integer value, used here as a mask.
  return (((hash + (hash << 15)) & 4294967295) >>> 0).toString(16)
};

Examples:

jenkinsOneAtATimeHash('test')
"31c25ec1"
jenkinsOneAtATimeHash('a')
"ca2e9442"
jenkinsOneAtATimeHash('0')
"6e3c5c6b"

You can also remove the .toString(16) part at the end to generate numbers:

jenkinsOneAtATimeHash2('test')
834821825
jenkinsOneAtATimeHash2('a')
3392050242
jenkinsOneAtATimeHash2('0')
1849449579

Note that if you do not need to hash a string or key specifically, but just need a hash generated out of thin air, you can use:

window.crypto.getRandomValues(new Uint32Array(1))[0].toString(16)

Examples:

window.crypto.getRandomValues(new Uint32Array(1))[0].toString(16)
"6ba9ea7"
window.crypto.getRandomValues(new Uint32Array(1))[0].toString(16)
"13fe7edf"
window.crypto.getRandomValues(new Uint32Array(1))[0].toString(16)
"971ffed4"

and the same as above, you can remove the `.toString(16) part at the end to generate numbers:

window.crypto.getRandomValues(new Uint32Array(1))[0]
1154752776
window.crypto.getRandomValues(new Uint32Array(1))[0]
3420298692
window.crypto.getRandomValues(new Uint32Array(1))[0]
1781389127

Note: You can also generate multiple values at once with this method, e.g.:

window.crypto.getRandomValues(new Uint32Array(3))
Uint32Array(3) [ 2050530949, 3280127172, 3001752815 ]
Andrew
  • 5,839
  • 1
  • 51
  • 72
3

If you want to avoid collisions you may want to use a secure hash like SHA-256. There are several JavaScript SHA-256 implementations.

I wrote tests to compare several hash implementations, see https://github.com/brillout/test-javascript-hash-implementations.

Or go to http://brillout.github.io/test-javascript-hash-implementations/, to run the tests.

brillout
  • 7,804
  • 11
  • 72
  • 84
  • 2
    Using a secure cryptographic hash can be extremely slow. Avoiding collisions is a product of the bit width, not security. 128 bit non-cryptographic hash or even 64 bits should be more than enough for most purposes. [MurmurHash3_x86_128](https://github.com/bryc/code/blob/master/jshash/hashes/murmurhash3_128.js) is quite fast and has a very low chance of collisions. – bryc Dec 09 '18 at 03:44
3

I do not see any reason to use this overcomplicated crypto code instead of ready-to-use solutions, like object-hash library, or etc. relying on vendor is more productive, saves time and reduces maintenance cost.

Just use https://github.com/puleos/object-hash

var hash = require('object-hash');

hash({foo: 'bar'}) // => '67b69634f9880a282c14a0f0cb7ba20cf5d677e9'
hash([1, 2, 2.718, 3.14159]) // => '136b9b88375971dff9f1af09d7356e3e04281951'
Oleg Abrazhaev
  • 2,751
  • 2
  • 28
  • 41
  • 3
    The source code of that lib isn't even readable.. just 50k of minified code. – bryc Feb 17 '19 at 15:05
  • 1
    @bryc that's how vendor code supposed to look like :) and for sources you can check https://github.com/puleos/object-hash/blob/master/index.js – Oleg Abrazhaev Feb 19 '19 at 08:10
  • 1
    Minified code is 35.4 KB while the full source is 14.2 KB? That makes no sense. – bryc Mar 19 '19 at 16:35
  • 2
    @bryc have you considered this line? `var crypto = require('crypto');`. I think it adds this dependency code from the vendor in the minified version during a build. – Oleg Abrazhaev Mar 20 '19 at 18:00
  • 1
    If you really need to hash Objects, I wrote [any-serialize](https://github.com/patarapolw/any-serialize) to serialize ANY Object with sort keys, then [cyrb53](https://stackoverflow.com/a/52171480/9023855) to generate base36 hash. – Polv Feb 22 '20 at 06:27
  • Wonder if there's any option to make the resulting hash string *shorter*, say 10 chars? – Yan King Yin Dec 23 '22 at 02:44
  • @YanKingYin just add this `.slice(0, 10)` – Oleg Abrazhaev Dec 27 '22 at 14:22
2

I have combined the two solutions (users esmiralha and lordvlad) to get a function that should be faster for browsers that support the js function reduce() and still compatible with old browsers:

String.prototype.hashCode = function() {

    if (Array.prototype.reduce) {
        return this.split("").reduce(function(a,b){a=((a<<5)-a)+b.charCodeAt(0);return a&a},0);   
    } else {

        var hash = 0, i, chr, len;
        if (this.length == 0) return hash;
        for (i = 0, len = this.length; i < len; i++) {
        chr   = this.charCodeAt(i);
        hash  = ((hash << 5) - hash) + chr;
        hash |= 0; // Convert to 32bit integer
        }
        return hash;
    }
};

Example:

my_string = 'xyz';
my_string.hashCode();
Frank
  • 809
  • 1
  • 10
  • 12
2

I went for a simple concatenation of char codes converted to hex strings. This serves a relatively narrow purpose, namely just needing a hash representation of a SHORT string (e.g. titles, tags) to be exchanged with a server side that for not relevant reasons can't easily implement the accepted hashCode Java port. Obviously no security application here.

String.prototype.hash = function() {
  var self = this, range = Array(this.length);
  for(var i = 0; i < this.length; i++) {
    range[i] = i;
  }
  return Array.prototype.map.call(range, function(i) {
    return self.charCodeAt(i).toString(16);
  }).join('');
}

This can be made more terse and browser-tolerant with Underscore. Example:

"Lorem Ipsum".hash()
"4c6f72656d20497073756d"

I suppose if you wanted to hash larger strings in similar fashion you could just reduce the char codes and hexify the resulting sum rather than concatenate the individual characters together:

String.prototype.hashLarge = function() {
  var self = this, range = Array(this.length);
  for(var i = 0; i < this.length; i++) {
    range[i] = i;
  }
  return Array.prototype.reduce.call(range, function(sum, i) {
    return sum + self.charCodeAt(i);
  }, 0).toString(16);
}

'One time, I hired a monkey to take notes for me in class. I would just sit back with my mind completely blank while the monkey scribbled on little pieces of paper. At the end of the week, the teacher said, "Class, I want you to write a paper using your notes." So I wrote a paper that said, "Hello! My name is Bingo! I like to climb on things! Can I have a banana? Eek, eek!" I got an F. When I told my mom about it, she said, "I told you, never trust a monkey!"'.hashLarge()
"9ce7"

Naturally more risk of collision with this method, though you could fiddle with the arithmetic in the reduce however you wanted to diversify and lengthen the hash.

swornabsent
  • 984
  • 1
  • 7
  • 14
2

Adding this because nobody did yet, and this seems to be asked for and implemented a lot with hashes, but it's always done very poorly...

This takes a string input, and a maximum number you want the hash to equal, and produces a unique number based on the string input.

You can use this to produce a unique index into an array of images (If you want to return a specific avatar for a user, chosen at random, but also chosen based on their name, so it will always be assigned to someone with that name).

You can also use this, of course, to return an index into an array of colors, like for generating unique avatar background colors based on someone's name.

function hashInt (str, max = 1000) {
    var hash = 0;
    for (var i = 0; i < str.length; i++) {
      hash = ((hash << 5) - hash) + str.charCodeAt(i);
      hash = hash & hash;
    }
    return Math.round(max * Math.abs(hash) / 2147483648);
}
Nick Steele
  • 7,419
  • 4
  • 36
  • 33
2

This generates a consistent hash based on any number of params passed in:

/**
 * Generates a hash from params passed in
 * @returns {string} hash based on params
 */
function fastHashParams() {
    var args = Array.prototype.slice.call(arguments).join('|');
    var hash = 0;
    if (args.length == 0) {
        return hash;
    }
    for (var i = 0; i < args.length; i++) {
        var char = args.charCodeAt(i);
        hash = ((hash << 5) - hash) + char;
        hash = hash & hash; // Convert to 32bit integer
    }
    return String(hash);
}

fastHashParams('hello world') outputs "990433808"

fastHashParams('this',1,'has','lots','of','params',true) outputs "1465480334"

John Doherty
  • 3,669
  • 36
  • 38
1

Slightly simplified version of @esmiralha's answer.

I don't override String in this version, since that could result in some undesired behaviour.

function hashCode(str) {
    var hash = 0;
    for (var i = 0; i < str.length; i++) {
        hash = ~~(((hash << 5) - hash) + str.charCodeAt(i));
    }
    return hash;
}
crazy2be
  • 2,134
  • 1
  • 21
  • 17
0

Another example use of SubtleCrypto.Digest():

I wanted to generate a random color from the hash of a piece of text:

const text = "some random piece of text";

crypto.subtle.digest('SHA-1', new TextEncoder().encode(text)).then(hash => {
  let hue = new DataView(hash).getUint32(0) % 360;
  let random_rgb = `hsl(${hue}, 100%, 50%)`;

  document.getElementById("bar").style.backgroundColor = random_rgb;
});
<div id="bar" style="width: 100px; height: 100px">
</div>
Peter Frost
  • 351
  • 3
  • 6
-1

stable-hash

A tiny and fast (481b unpkg) lib for "stably hashing" a JavaScript value. Originally created for SWR.

import hash from 'stable-hash'

hash(anyJavaScriptValueHere) // returns a string
Mechanic
  • 5,015
  • 4
  • 15
  • 38