I am trying to add a resume-upload capability to my system. I compute the SHA-256 hash of the file in client-side JS, then compare that hash against the server-side DB to check whether the file was already uploaded (or save the hash as-is if it is a new file).

But some of the files can be tens of GB in size, and hashing the whole file would be very time-consuming. So I am hashing only the first 500 MB of each file.

Is there any significant chance of collision here?

Prakhar Londhe
  • https://stackoverflow.com/questions/4014090/is-it-safe-to-ignore-the-possibility-of-sha-collisions-in-practice – nlanson May 25 '22 at 05:57
  • This is a little different from what I asked. I know the probability of two dissimilar files' SHA-256 hashes colliding is vanishingly low, but here I am more interested in the probability of two files having the same first 500 MB. – Prakhar Londhe May 25 '22 at 20:45
  • There are 500,000,000 bytes in 500 MB, each of which can take any value between 0 and 255. Hence, if every byte is independent and uniformly random, the probability of two different files having exactly the same first 500 MB is 256^-500000000. – nlanson May 26 '22 at 01:28
  • But can't different versions of the same application have some parts of their binaries in common? – Prakhar Londhe May 26 '22 at 05:43
  • I'm fairly certain that, with a probability of 256^-500000000, it is safe to ignore. – nlanson May 26 '22 at 13:14
