I need to compute the MD5 of an image in both Django and JavaScript, but I don't get the same result. The Django code gives me the same MD5 as md5sum in a terminal.

Django code :

import hashlib

file0 = request.FILES.get('file')
buffer0 = file0.read()  # raw bytes of the uploaded file
print(hashlib.md5(buffer0).hexdigest())  # 9f982242d24ab97a7254edd7e28e3921

I tried two JavaScript MD5 libraries, which agree with each other ( https://github.com/blueimp/JavaScript-MD5 and https://github.com/sytelus/CryptoJS ). I have tried different read methods (readAsArrayBuffer, readAsBinaryString, ...) and none of them gives me the same MD5 as Python and md5sum.

Javascript :

var reader = new FileReaderSync();
var datas = reader.readAsArrayBuffer(file);
console.log(md5(datas)); //060e4e9e30bcb9ae675a80328a87a687

var string0 = reader.readAsBinaryString(file);
console.log(md5(string0)); //2e4cac0a23ddf95683c6538d64b26e21
console.log(CryptoJS.MD5(string0).toString(CryptoJS.enc.Hex)); //2e4cac0a23ddf95683c6538d64b26e21

var string1 = reader.readAsText(file,'ascii');
console.log(md5(string1)); //329c4271b8eda786213b2468e378b251
console.log(CryptoJS.MD5(string1).toString(CryptoJS.enc.Hex));//329c4271b8eda786213b2468e378b251

var view   = new Uint8Array(datas);
var str = ""
for (var i=0, strLen=view.length; i < strLen; i++) {
  str+= String.fromCharCode(view[i]);
}
console.log(md5(str)); //2e4cac0a23ddf95683c6538d64b26e21
console.log(CryptoJS.MD5(str).toString(CryptoJS.enc.Hex));

I think the problem comes from how I read the file in JS.

Reiner
  • Your file might be the wrong size, as per this similar question https://stackoverflow.com/a/3431838/3194722. If not, I suggest possibly using something other than MD5, as it is broken: http://google.com(https://security.stackexchange.com/questions/15790/why-do-people-still-use-recommend-md5-if-it-is-cracked-since-1996) – Jeremy Jun 02 '17 at 16:23
  • @Jeremy To say "MD5 … is broken" one must look at the usage; it is not broken for all usages. The second [link](https://security.stackexchange.com/a/31871/5121) in the comment is broken. From that link is an answer that includes: *"MD5 is thoroughly broken with regards to collisions, but not for preimages or second-preimages."* But even the collision case is only broken against an attacker attempting to find a collision, which is not the case for file hashes. – zaph Jun 02 '17 at 21:34
  • Alright, agreed with you on that. But for enterprise purposes, most companies/consumers are told that if a company uses md5, they shouldn't be trusted. For these purposes it shouldn't be a problem, but best practices + community conventions would recommend away from using md5. Would you agree with that? – Jeremy Jun 03 '17 at 00:52
  • @Jeremy I don't think it's a size problem. I have tried with a really small image (2 KB) and I still get a different MD5 with JS. I know that MD5 is not the best for security, but for my use it is enough (I just need to check that the image was not edited by the user before I send it to the server, so a really basic check is enough). – Reiner Jun 03 '17 at 07:48
  • @Jeremy Sure, it is best to move away from MD5 and SHA1; the problem is the over-reaching statement *MD5 … is broken*. Also it may well be that SHA-256 or SHA-512 are faster, and the output can be truncated to meet the needs. – zaph Jun 03 '17 at 22:31

1 Answer

I used SparkMD5 (https://github.com/satazor/js-spark-md5), which can work directly on an array buffer, and it gives me the same MD5 as Python.
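
A likely explanation for the earlier mismatches, offered as an assumption rather than something verified against this exact file: md5sum and hashlib.md5 hash the raw bytes, while string-oriented MD5 functions typically UTF-8 encode their string input first. In a "binary string" every byte >= 0x80 then turns into two bytes, so the digest changes. The sketch below shows that effect with Node's built-in crypto module instead of the browser libraries; md5Hex and the sample bytes are hypothetical and only for illustration.

```javascript
// Node.js sketch using only the built-in crypto module (not the browser libraries).
const crypto = require('crypto');

// Hypothetical helper: MD5 hex digest of a Buffer of raw bytes.
function md5Hex(bytes) {
  return crypto.createHash('md5').update(bytes).digest('hex');
}

// Stand-in for image data: the 8-byte PNG signature, which contains one byte >= 0x80.
const fileBytes = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Roughly what readAsBinaryString yields: one character per byte (code points 0..255).
const binaryString = fileBytes.toString('latin1');

// Hashing the raw bytes, as md5sum and hashlib.md5 do.
const digestOfBytes = md5Hex(fileBytes);

// Hashing the same string after UTF-8 encoding: 0x89 becomes the two bytes 0xC2 0x89.
const utf8Bytes = Buffer.from(binaryString, 'utf8');
const digestOfUtf8String = md5Hex(utf8Bytes);

console.log(fileBytes.length, utf8Bytes.length);   // 8 9
console.log(digestOfBytes === digestOfUtf8String); // false
```

In the browser, the byte-level equivalent with SparkMD5 would be SparkMD5.ArrayBuffer.hash(datas), where datas is the ArrayBuffer returned by readAsArrayBuffer in the question.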

Reiner