I am uploading a file in chunks using the Blob API, and I want to check the MD5 checksum of each blob. The code below works fine for text files, but it returns a different value for binary files.

var reader = new FileReader();
reader.readAsBinaryString(blob);
reader.onloadend = function () {
    var mdsum = CryptoJS.MD5(reader.result);
    console.log("MD5 Checksum",mdsum.toString());
};

How can I calculate the MD5 checksum of a blob correctly for all types of files?

t.niese
Awesome
  • http://stackoverflow.com/questions/17819820/how-to-get-correct-sha1-hash-of-blob-using-cryptojs – Mi-Creativity Dec 28 '15 at 11:29
  • You're using CryptoJS. This question has nothing to do with jQuery. – Oleg V. Volkov Dec 28 '15 at 11:31
  • Please be aware that MD5 is considered "cryptographically broken and unsuitable for further use". Unless you have to use it for compatibility with an externally provided service, consider switching to SHA-2 (SHA-256 etc.). – jcaron Dec 28 '15 at 12:17
  • https://stackoverflow.com/a/61823010/926519 – Miguel May 15 '20 at 15:36

1 Answer

Use the following code to create a correct md5 hash:

  function calculateMd5(blob, callback) {
    var reader = new FileReader();
    // Register the handler before starting the read, so it cannot be missed.
    reader.onloadend = function () {
      // reader.result is an ArrayBuffer; wrap it so CryptoJS hashes the raw bytes.
      var wordArray = CryptoJS.lib.WordArray.create(reader.result),
          hash = CryptoJS.MD5(wordArray).toString();
      // or CryptoJS.SHA256(wordArray).toString(); for SHA-2
      console.log("MD5 Checksum", hash);
      callback(hash);
    };
    reader.readAsArrayBuffer(blob);
  }
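To illustrate what wrapping the `ArrayBuffer` achieves, here is a minimal sketch of packing raw bytes into big-endian 32-bit words plus a byte count, which is, as far as I understand it, the layout a CryptoJS word array uses. `bytesToWords` is a hypothetical helper name for illustration only; it is not part of CryptoJS, and the real conversion is done by `lib-typedarrays.js`.

```javascript
// Hypothetical helper (not a CryptoJS function): pack bytes into
// big-endian 32-bit words, four bytes per word.
function bytesToWords(bytes) {
  var words = [];
  for (var i = 0; i < bytes.length; i++) {
    // i >>> 2 selects the word; the shift puts byte 0 in the top 8 bits.
    words[i >>> 2] |= bytes[i] << (24 - (i % 4) * 8);
  }
  // sigBytes records how many bytes are significant, since the
  // last word may be only partially filled.
  return { words: words, sigBytes: bytes.length };
}

var packed = bytesToWords(new Uint8Array([0xde, 0xad, 0xbe, 0xef, 0x01]));
// Words are signed 32-bit integers; ">>> 0" shows them as unsigned.
console.log((packed.words[0] >>> 0).toString(16), packed.sigBytes);
```

Every byte of the file ends up in the hash input exactly once, with no text decoding step in between, which is why this route is safe for binary data.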

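Update (an alternative that reads a binary string; the string must be parsed as Latin1, because CryptoJS otherwise treats a string as UTF-8 and binary files hash incorrectly):

 function calculateMd5(blob, callback) {
    var reader = new FileReader();
    reader.onloadend = function () {
      // Each character of the binary string is one byte, so parse it as Latin1.
      var bytes = CryptoJS.enc.Latin1.parse(reader.result),
          hash = CryptoJS.MD5(bytes).toString();
      // or CryptoJS.SHA256(bytes).toString(); for SHA-2
      console.log("MD5 Checksum", hash);
      callback(hash);
    };
    reader.readAsBinaryString(blob);
  }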

Be sure to include the core.js, lib-typedarrays.js (important) and md5.js components from the CryptoJS library.
Please see this fiddle for a complete example (because of origin access control it won't work on fiddle, try it on your local server).
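As for why passing a binary string straight to `CryptoJS.MD5` gives a different value for binary files: CryptoJS parses string input as UTF-8, so every character with a code of 0x80 or above becomes two bytes, and the hash is computed over data that differs from the file. The sketch below shows the byte-level difference without CryptoJS; `latin1Bytes` is a hypothetical helper, and the sample bytes are just the start of a PNG signature.

```javascript
// A "binary string" holds one character per file byte (char codes 0..255).
var binaryString = String.fromCharCode(0x89, 0x50, 0x4e, 0x47);

// Hypothetical helper: Latin1 view, one byte per character (the real file bytes).
function latin1Bytes(str) {
  var out = [];
  for (var i = 0; i < str.length; i++) {
    out.push(str.charCodeAt(i) & 0xff);
  }
  return out;
}

var latin1 = latin1Bytes(binaryString);
// UTF-8 view: what a hasher sees if it encodes the string as UTF-8.
var utf8 = Array.from(new TextEncoder().encode(binaryString));

console.log(latin1.length, utf8.length); // the two byte sequences differ
```

ASCII-only text files contain no bytes at or above 0x80, so both interpretations agree, which would explain why the question's code appeared to work for text files.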

Dmitri Pavlutin
  • You should define the `onloadend` handler before the call to `readAsBinaryString` or `readAsArrayBuffer`: otherwise, with a small enough buffer, you might get the `onloadend` event handler registered after the event was triggered. – Thierry Mar 23 '17 at 15:03
  • @Thierry Can you show a demo of that? The `readAsBinaryString` method initiates a new task in the queue, which is not executed in the same stack when `reader.onloadend` is setup (as an effect of JavaScript's event loop). So the order of `reader.readAsBinaryString` and `reader.onloadend` doesn't matter. – Dmitri Pavlutin Mar 23 '17 at 15:54
  • Well, it happened to me this afternoon when using this code. The method that triggers the call to `onloadend` is `readAsBinaryString` (from Mozilla: `When the read operation is finished, the readyState becomes DONE, and the loadend is triggered`), and in your code you register the event handler *after* the method that will trigger it. Perhaps it will work 99% of the time, but it will sometimes fail too. Swapping the two operations will work in every case. – Thierry Mar 23 '17 at 21:50
  • @Thierry Can you simply show me a demo to demonstrate the problem? It doesn't matter in which order these lines are executed, because the read operation starts a new task in the queue. The [specification](https://w3c.github.io/FileAPI/#dfn-readAsBinaryString) says: `Initiate an annotated task read operation using the blob argument as input and handle tasks queued on the file reading task source per below.` – Dmitri Pavlutin Mar 24 '17 at 06:39
  • It is a race condition. Demonstrating it in every case on a platform with good multithreading is often hard without using `Thread.sleep`. But this is JavaScript: multithreading behavior will depend on the browser used, and there is no `Thread.sleep`. A race condition can also be uncovered with a debugger, by stepping through the code. Here is a jsfiddle: http://jsfiddle.net/k73va9ob/2/ . Open it, open the dev console (F12), run it, click the button, and click the 'play' button once the breakpoint is hit. You'll see that the `onloadend` method is not called. – Thierry Mar 24 '17 at 09:02
  • @Thierry Thank you for providing the demo. I tried both simple run and with the debugger and it works as expected (Chrome v57). JavaScript is single threaded, no race conditions can happen. Maybe the problem is in the way a specific engine implements the API. – Dmitri Pavlutin Mar 24 '17 at 09:21
  • Indeed. I've just tested with Chrome, and it is working as expected. Firefox is showing the issue. A Firefox bug, perhaps? – Thierry Mar 24 '17 at 09:41
  • @Thierry It could be. – Dmitri Pavlutin Mar 24 '17 at 09:52
  • Not sure why, but the "update" version of the code is not returning the correct MD5 hash for a blob (with image data). The original one returns the right one. – pagep Jul 12 '19 at 10:43
  • Could you explain why the author's code is wrong? Why can't it just be `CryptoJS.MD5(reader.result);`? – sinbar Mar 25 '21 at 09:08
  • For me, this produced the right result: `var hash = CryptoJS.MD5(CryptoJS.enc.Latin1.parse(reader.result)).toString();` – abhinav gajurel Jul 12 '21 at 23:50