160

I'm trying to create a WebSocket server written in Node.js.

To get the server to work I need to get the SHA1 hash of a string.

What I have to do is explained in Section 5.2.2 page 35 of the docs.

NOTE: As an example, if the value of the "Sec-WebSocket-Key" header in the client's handshake were "dGhlIHNhbXBsZSBub25jZQ==", the server would append the string "258EAFA5-E914-47DA-95CA-C5AB0DC85B11" to form the string "dGhlIHNhbXBsZSBub25jZQ==258EAFA5-E914-47DA-95CA-C5AB0DC85B11". The server would then take the SHA-1 hash of this string, giving the value 0xb3 0x7a 0x4f 0x2c 0xc0 0x62 0x4f 0x16 0x90 0xf6 0x46 0x06 0xcf 0x38 0x59 0x45 0xb2 0xbe 0xc4 0xea. This value is then base64-encoded, to give the value "s3pPLMBiTxaQ9kYGzzhZRbK+xOo=", which would be returned in the "Sec-WebSocket-Accept" header.

Eric
  • 2,886
  • 3
  • 20
  • 19
  • 9
    I would *highly* recommend using the excellent http://socket.io/ library instead of rolling your own. Not only has this been extensively tested and patched, but it supports most browsers (even those without the WebSocket API) through various methods. – Alex Turpin Aug 08 '11 at 15:07
  • 1
    A good reference for the future visitors: http://stackoverflow.com/questions/9407892/how-to-generate-random-sha1-hash-to-use-as-id-in-node-js – Damodaran Nov 13 '13 at 12:34

6 Answers

341

See the crypto.createHash() function and the associated hash.update() and hash.digest() functions:

var crypto = require('crypto')
var shasum = crypto.createHash('sha1')
shasum.update('foo')
shasum.digest('hex') // => "0beec7b5ea3f0fdbc95d0dd47f3c5bc275da8a33"
maerics
  • 151,642
  • 46
  • 269
  • 291
  • 1
    If you look at the example at https://tools.ietf.org/html/draft-ietf-hybi-thewebsocketprotocol-10#page-36 the expected base64 value is "s3pPLMBiTxaQ9kYGzzhZRbK+xOo=". To get to that you should omit the encoding argument to shasum.digest() so you get back a buffer (not a string). You can then call that buffer's .toString('base64') and you get the expected answer. – Panu Logic Apr 28 '21 at 22:13
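
To tie this back to the handshake in the question, here is a minimal sketch of the accept-value computation using only the built-in crypto module. The function name `computeAcceptValue` and the constant name `WEBSOCKET_GUID` are made up for illustration; the GUID and the sample key come from the spec excerpt quoted in the question. Passing 'base64' straight to digest() should give the same result as taking the buffer and calling .toString('base64') on it.

```javascript
const crypto = require('crypto');

// Fixed GUID from the WebSocket spec excerpt quoted in the question
const WEBSOCKET_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Hypothetical helper: builds the Sec-WebSocket-Accept value for a client key
function computeAcceptValue(secWebSocketKey) {
  return crypto
    .createHash('sha1')
    .update(secWebSocketKey + WEBSOCKET_GUID) // append the GUID to the client's key
    .digest('base64');                        // base64 of the raw 20-byte digest, not of the hex string
}

console.log(computeAcceptValue('dGhlIHNhbXBsZSBub25jZQ=='));
// => "s3pPLMBiTxaQ9kYGzzhZRbK+xOo=" (the value given in the spec example)
```

The thing to avoid is base64-encoding the hex string from digest('hex'); the spec example encodes the raw digest bytes.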
59

Obligatory: SHA1 is broken; you can compute SHA1 collisions for 45,000 USD (and even less since this answer was written). You should use sha256:

const crypto = require('crypto')

var getSHA256ofJSON = function(input){
    return crypto.createHash('sha256').update(JSON.stringify(input)).digest('hex')
}

To answer your question and make a SHA1 hash:

const INSECURE_ALGORITHM = 'sha1'
var getInsecureSHA1ofJSON = function(input){
    return crypto.createHash(INSECURE_ALGORITHM).update(JSON.stringify(input)).digest('hex')
}

Then:

getSHA256ofJSON('whatever')

or

getSHA256ofJSON(['whatever'])

or

getSHA256ofJSON({'this':'too'})

Official node docs on crypto.createHash()

mikemaccana
  • 110,530
  • 99
  • 389
  • 494
  • 9
    Good idea. Note, however, that all objects (except arrays and null) will have the same sha1sum value since `Object.toString()` returns `[object Object]` by default. So `sha1sum({})` === `sha1sum({"foo":"bar"})` === `sha1sum({"a":1})`, etc. – maerics Jun 04 '15 at 21:40
  • sha1(JSON.stringify("some string")) => sha1("\"some string\"") which is absolutely not expected and not cross platform. Sometimes better is the enemy of good. – Pierre Mar 18 '16 at 16:39
  • 5
    sha1 of a given string is expected to be the same on any platform. Your implementation using JSON.stringify is altering the original string and sha1sum("abcd") gives f805c8fb0d5c466362ce9f0dc798bd5b3b32d512 where anyone would expect 81fe8bfe87576c3ecb22426f8e57847382917acf – Pierre Mar 23 '16 at 07:55
  • 2
    @Pierre That's an excellent point. I think naming the function `sha1sum` is inaccurate given what you've said - this plainly does more than what a normal sha1 would. I've renamed the function in the answer. – mikemaccana Mar 23 '16 at 13:42
  • There is as of today no known collision for the standard 80-round SHA-1 according to http://stackoverflow.com/a/3476791/1236215 – kzahel Aug 12 '16 at 04:39
  • @kzahel As you can see from that link, this is no longer the case. This was inevitable back in 2015 when SHA1 began looking weak, based on the prior experiences we've had with MD5, MD4, crypt/DES etc. – mikemaccana Oct 23 '17 at 12:28
  • 3
    Since the haveibeenpwned.com API expects SHA1-hashes, there are perfectly valid reasons to use it even nowadays. But thanks for the answer nevertheless! – NotX Jan 29 '21 at 16:32
  • It has got to be better than CRC16. $45 is a bit much. Check the Graviton2 on AWS. – mckenzm Nov 04 '21 at 07:50
  • 2
    Is SHA1 still ok to use for diff checks? Not all hashing applications are security related. I think throwing a blanket statement like "SHA1 is broken, You should use sha256" is misleading. SHA1 is very fast which is the requirement for certain non-security related applications. – Storm Muller Oct 26 '22 at 14:38
  • @StormMuller Hashing applications may not seem security related at first, but could be - eg a diff check could be comparing two untrusted inputs, and a maliciously constructed one could break SHA1. Unless performance is an issue - which it probably isn't - it's best to not have to worry about trading off security. – mikemaccana Oct 26 '22 at 15:48
  • 1
    And I'm sure 95% of people that land on this question are here for some security related use case. And I agree with your word of caution. I'm just saying that there are use cases where hashing is definitely not security related. Also, performance should very much be a consideration for most devs. So a blanket statement is not correct. – Storm Muller Oct 28 '22 at 07:32
  • 1
    Tip: JSON.stringify does not guarantee property order, better to sort the properties before a la https://stackoverflow.com/questions/16167581/sort-object-properties-and-json-stringify – Sander Apr 05 '23 at 10:10
  • @maerics But keep in mind that the JSON object is stringified before calculating the hash. Therefore your assumption is likely not correct. – Torsten Barthel May 15 '23 at 22:57
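
The property-order caveat raised in the comments above can be sketched roughly like this. The helper names `sortKeysDeep` and `sha256OfCanonicalJSON` are hypothetical and not part of any library, and the sketch assumes the input is plain JSON-compatible data (no Dates, Maps, class instances and so on).

```javascript
const crypto = require('crypto');

// Hypothetical helper: rebuilds a value with the keys of every nested object sorted,
// so that JSON.stringify output no longer depends on insertion order.
function sortKeysDeep(value) {
  if (Array.isArray(value)) return value.map(sortKeysDeep); // keep array order as-is
  if (value !== null && typeof value === 'object') {
    const sorted = {};
    for (const key of Object.keys(value).sort()) {
      sorted[key] = sortKeysDeep(value[key]);
    }
    return sorted;
  }
  return value; // strings, numbers, booleans, null
}

// Hypothetical helper: hashes the key-sorted JSON text of the input
function sha256OfCanonicalJSON(input) {
  return crypto.createHash('sha256').update(JSON.stringify(sortKeysDeep(input))).digest('hex');
}

// Same data, different insertion order
const a = sha256OfCanonicalJSON({ x: 1, y: { b: 2, a: 3 } });
const b = sha256OfCanonicalJSON({ y: { a: 3, b: 2 }, x: 1 });
console.log(a === b); // true
```

Without the sorting step, the two objects above stringify to different text and therefore hash differently. As also pointed out in the comments, a plain string still gets its JSON quotes added before hashing, so this is a hash of the JSON text rather than of the raw string.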
19

I found that Node.js hashes the UTF-8 representation of the string, while the hash functions in other languages (like Python, PHP or Perl) operate on the byte string.

Tip to get the same hash as Python, PHP, etc.:

We can add the 'binary' encoding argument to hash the byte string (this encoding increases the likelihood of collisions, but it is compatible with other languages).

const crypto = require('crypto')

function sha1(data) {
    return crypto.createHash('sha1').update(data, 'binary').digest('hex')
}

const text = 'Your text and symbol \xac'

console.log(text, ':', sha1(text))

You can try with : "\xac", "\xd1", "\xb9", "\xe2", "\xbb", "\x93", etc...

Other languages (Python, PHP, ...):

sha1('\xac') //39527c59247a39d18ad48b9947ea738396a3bc47

Nodejs:

sha1 = crypto.createHash('sha1').update('\xac', 'binary').digest('hex') //39527c59247a39d18ad48b9947ea738396a3bc47
//without:
sha1 = crypto.createHash('sha1').update('\xac').digest('hex') //f50eb35d94f1d75480496e54f4b4a472a9148752
user2226755
  • 12,494
  • 5
  • 50
  • 73
  • 2
    `'binary'` - Alias for `'latin1'` https://nodejs.org/api/buffer.html#buffer_buffers_and_character_encodings – Jossef Harush Kadouri Dec 10 '18 at 12:04
  • 3
    ^^ Extremely important remark there by @JossefHarush! If you do not specifically need to encode the text as latin1 before hashing (e.g. exactly for compatibility with PHP), and there's any chance whatsoever that your text contains Unicode symbols outside the latin1 range (e.g. emoji!), do not use `binary`! Using `binary` or `latin1` in the encoding will _lose information_ and increase the likelihood of collisions! Try the snippet above with these two for example: `❤` and `⑤` – cbr Nov 29 '19 at 13:09
  • All hashes are done on binary data. The problem you're experiencing is that the other languages you mention are not using UTF-8, not the other way around. This will become very apparent once you try to hash something outside of Latin1. In the case of PHP in particular, the encoding is entirely determined by the source, such as the text file itself for hard-coded text. Perl might need some heavy lifting to use UTF-8. – Ryan Hanekamp Jun 12 '20 at 19:17
  • 1
    We cannot make any assumptions about the character set of the pattern being hashed, it must be binary. Nobody gives a rat's about collisions, all implementations must be deterministic. – mckenzm Nov 04 '21 at 04:54
  • 2
    @mckenzm true, however, the 'correct' binary representation for string literals in nodejs is utf8. The problem here is using `"\xac"` and expecting the binary representation to be `[0xAC]` like in other languages that have a different native encoding. Node uses utf8, therefore `"\xac"` (or `"¬"`) in utf8 source code is interpreted as `[0xC2, 0xAC]` and the correct sha1 of that array is produced by above code. The fix is to read the data in binary in the first place because there's no such thing as an inherently correct cross platform binary representation for strings. – zapl Sep 05 '22 at 20:43
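
To make the encoding discussion in the comments above concrete, here is a small sketch that builds the bytes explicitly with Buffer.from before hashing, so it is visible what actually goes into SHA-1. The helper name `sha1Hex` is made up for illustration. The last part assumes Node's documented latin1 behaviour of truncating characters that do not fit in one byte.

```javascript
const crypto = require('crypto');

// Hypothetical helper: SHA-1 over an explicit byte buffer, returned as hex
const sha1Hex = (bytes) => crypto.createHash('sha1').update(bytes).digest('hex');

const text = '\xac';
const utf8Bytes = Buffer.from(text, 'utf8');     // two bytes: c2 ac
const latin1Bytes = Buffer.from(text, 'latin1'); // one byte: ac ('binary' is an alias for 'latin1')

console.log(utf8Bytes, sha1Hex(utf8Bytes));     // what update('\xac') hashes by default
console.log(latin1Bytes, sha1Hex(latin1Bytes)); // what update('\xac', 'binary') hashes

// Information loss outside the Latin-1 range: both characters collapse to the same byte
const heart = '\u2764';      // the heart symbol from the comment above
const circledFive = '\u2464'; // the circled five from the comment above
console.log(Buffer.from(heart, 'latin1'), Buffer.from(circledFive, 'latin1'));
console.log(sha1Hex(Buffer.from(heart, 'latin1')) === sha1Hex(Buffer.from(circledFive, 'latin1'))); // true
```

So the two results differ because the input bytes differ, not because SHA-1 behaves differently in Node. When the data does not originate as a JavaScript string (a file or a socket, for example), passing the Buffer straight to update() avoids the question entirely.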
12

You can use:

  const sha1 = require('sha1');
  const crypt = sha1('Text');
  console.log(crypt);

To install:

  sudo npm install -g sha1
  npm install sha1 --save
user944550
  • 125
  • 1
  • 6
7

Please read and strongly consider my advice in the comments of your post. That being said, if you still have a good reason to do this, check out this list of crypto modules for Node. It has modules for dealing with both sha1 and base64.

Alex Turpin
  • 46,743
  • 23
  • 113
  • 145
3

An answer using the browser-compatible, zero-dependency SubtleCrypto API added in Node v15:

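// globalThis works in browsers and in both CommonJS and ES modules (top-level `this` does not)
const crypto = globalThis.crypto || require('crypto').webcrypto;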

const sha1sum = async (message) => {
  const encoder = new TextEncoder()
  const data = encoder.encode(message)
  const hashBuffer = await crypto.subtle.digest('SHA-1', data)
  const hashArray = Array.from(new Uint8Array(hashBuffer));                     // convert buffer to byte array
  const hashHex = hashArray.map(b => b.toString(16).padStart(2, '0')).join(''); // convert bytes to hex string
  return hashHex;
}

sha1sum('foo')
  .then(digestHex => console.log(digestHex))

// "0beec7b5ea3f0fdbc95d0dd47f3c5bc275da8a33"
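
For the WebSocket handshake in the question, the digest is needed as base64 rather than hex. A rough Node-only variant of the same idea is sketched below; the function name `sha1Base64` is made up for illustration, and it relies on Node's Buffer for the base64 step, so that part is not browser compatible as written.

```javascript
const { webcrypto } = require('crypto');

// Hypothetical helper: SHA-1 via SubtleCrypto, returned as a base64 string
const sha1Base64 = async (message) => {
  const data = new TextEncoder().encode(message);                 // string -> UTF-8 bytes
  const hashBuffer = await webcrypto.subtle.digest('SHA-1', data); // resolves to an ArrayBuffer
  return Buffer.from(hashBuffer).toString('base64');              // Node-specific conversion
};

// Key and GUID taken from the spec example quoted in the question
sha1Base64('dGhlIHNhbXBsZSBub25jZQ==' + '258EAFA5-E914-47DA-95CA-C5AB0DC85B11')
  .then(accept => console.log(accept));

// "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```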

Node Sandbox: https://runkit.com/hesygolu/61564dbee2ec8600082a884d

Ray Foss
  • 3,649
  • 3
  • 30
  • 31