
The total number of possible Git commit hashes is 16^40: there are 16 possible hexadecimal digits and 40 digits in a SHA-1 value.

This comes to roughly 10^48 (a bit more than that, but it's a fair approximation).
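
For a quick sanity check of that figure, here is a minimal Python sketch; nothing Git-specific is assumed, it just evaluates the count described above:

```python
# Number of distinct SHA-1 values: 16 possible hex digits in each of 40 positions.
total_hashes = 16 ** 40          # equivalently 2 ** 160

print(total_hashes)              # 1461501637330902918203684832716283019655932542976
print(f"{total_hashes:.2e}")     # ~1.46e+48, i.e. a bit more than 10^48
```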

My question is: since these values must be unique per commit, how have they not been exhausted by now?

Or

Are these values unique only within a repository, i.e. locally unique, which would prevent them from ever being exhausted?

As you can see, I am not sure whether they are locally or globally unique.

Edit -

The question has been answered, but I would also recommend the question [Git hash duplicates](https://stackoverflow.com/q/56012233/1256452), as it is somewhat similar to mine. Thanks to @torek for mentioning it.

  • I don't think you appreciate how big a number 10^48 actually is :) – hobbs Jul 15 '22 at 04:35
  • For comparison, it is estimated that the number of grains of sand on Earth is around 7.5*10^18. Let's round that up to roughly 10^19. By comparison, 10^48 is a huge number. However, the worry currently is not the exhaustion of SHA values but the deliberate generation of files with the same SHA-1 hash (see: https://www.zdnet.com/article/google-breaks-sha-1-web-crypto-for-good-but-torvalds-plays-down-impact-on-git). Due to this, Git is slowly migrating to SHA-256. – slebetman Jul 15 '22 at 04:43
  • @slebetman That's interesting, thanks for sharing. – Antas Sharma Jul 15 '22 at 04:49
  • @hobbs Ah, I did not realise that at all. – Antas Sharma Jul 15 '22 at 04:49
  • Plus... Git will [migrate to SHA-256 anyway](https://stackoverflow.com/a/47838703/6309) (and [here is why](https://stackoverflow.com/a/60088126/6309)). – VonC Jul 15 '22 at 05:19
  • @VonC I see. I had only heard of SHA-256, so thank you for sharing this. – Antas Sharma Jul 15 '22 at 08:31
  • See also [Git hash duplicates](https://stackoverflow.com/q/56012233/1256452). Note my answer in particular, which mentions the birthday problem. For some pre-digested numbers, see [my answer](https://stackoverflow.com/a/34804006/1256452) to https://stackoverflow.com/q/34802500/1256452 as well. – torek Jul 15 '22 at 13:37
  • @torek I was checking Stack Overflow for something similar to my question; the **Git hash duplicates** question you mentioned is probably one of the only ones that comes close, but it did not show up when I used the similar-questions feature while posting this question :/ – Antas Sharma Jul 15 '22 at 13:50

1 Answer


Pay attention to what that "48" is counting: it's the number of zeroes after the leading "1".

Say there are ten billion people on Earth. That's 1e10. Say all ten billion of them are using Git and generating ten billion hash codes each, every second, non-stop. That's 1e20 hash codes used per second if we dedicate the entire human race, full time, with fantasy hardware. How long would it take them to get through even 0.01% of the Git hash codes? Consuming all 1e48 of them at 1e20 per second takes 1e28 seconds; 0.01% of that is 1e24 seconds, which at roughly 1e8 seconds per year is 1e16 years, i.e. ten million billion years. Had we started before the Big Bang, we'd be only about 0.0000014 of the way to using up 0.01% of the Git hash codes by now.
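
If you want to play with those numbers yourself, here is a small Python sketch of the same back-of-the-envelope calculation; every input is one of the deliberately absurd assumptions above, not a measurement of anything real:

```python
# Back-of-the-envelope: time to consume 0.01% of all possible SHA-1 values.

total_hashes = 16 ** 40                        # ~1.46e48 possible SHA-1 values
people = 1e10                                  # ten billion Git users (assumed)
hashes_per_person_per_second = 1e10            # ten billion hashes each, per second (assumed)
rate = people * hashes_per_person_per_second   # 1e20 hashes consumed per second

seconds_for_all = total_hashes / rate              # ~1.46e28 seconds to use them all
seconds_for_0_01_percent = seconds_for_all * 1e-4  # ~1.46e24 seconds

seconds_per_year = 1e8                         # generous rounding; a real year is ~3.15e7 s
years = seconds_for_0_01_percent / seconds_per_year
print(f"{years:.2e} years")                    # ~1.46e+16 years

age_of_universe_years = 1.38e10
print(f"{age_of_universe_years / years:.1e}")  # ~9.4e-07 of the way, had we started at the Big Bang
```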

jthill