How many possible hash values does one need to avoid clashes among N
items? If you recall the birthday paradox, the answer is much larger than N:
on the order of N^2.
Let's reverse the question: given 16^10 (about 1.1 × 10^12) possible hash values, which corresponds to the 10 hex digits of an abbreviated git revision code, at how many revisions does the probability of a hash coincidence rise to 50%? A direct calculation, assuming the hashes are uniformly distributed, shows that with 1234603 revisions the probability that at least two of them share the same 10-digit hash is 50%.
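For anyone who wants to check the number, here is a minimal sketch of the calculation. It assumes uniformly distributed hashes and uses the usual birthday approximation P(no clash) ≈ exp(-n(n-1)/(2H)), which is very accurate when n is much smaller than H. The function names are made up for illustration, and the result should land within a revision or two of the figure quoted above, depending on rounding.

```python
import math


def clash_probability(revisions: int, hash_values: int) -> float:
    """Approximate chance that at least two of `revisions` random hashes coincide."""
    # Birthday approximation: P(no clash) ~= exp(-n(n-1) / (2H)), valid for n << H.
    exponent = revisions * (revisions - 1) / (2 * hash_values)
    return -math.expm1(-exponent)  # 1 - exp(-exponent), computed stably


def revisions_for_probability(p: float, hash_values: int) -> int:
    """Approximate revision count at which the clash probability reaches p."""
    # Solve n(n-1) / (2H) = -ln(1 - p) for n and round up.
    target = -math.log1p(-p)
    return math.ceil((1 + math.sqrt(1 + 8 * hash_values * target)) / 2)


HASH_VALUES = 16 ** 10  # 10 hex digits

print(revisions_for_probability(0.5, HASH_VALUES))
print(clash_probability(1234603, HASH_VALUES))
```

The same two functions work for other abbreviation lengths: replace 16 ** 10 with 16 ** 7 or 16 ** 12 to see how quickly the threshold moves.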
Now, a million or so revisions is not unheard of in large active repositories. Has anybody here experienced a git hash clash in your work? Theoretically speaking, that ought to have happened.