Are there duplicated SHA commits?

Question

Every time you make a commit, git/hg generates a SHA to uniquely identify that commit in the repository's history.

Suppose I want to merge two repositories (about which we know nothing in advance). This raises the question: if someone wanted a specific commit from that merged repo, could there be a duplicated SHA hash that would confuse git when fetching that commit? And if so, what would git do?

Ultimately, I guess the question is also: are there duplicated hashes across all the repositories in the world?

Not a duplicate. An extension maybe, to how git would handle if that happened. — Patrick Bassut, Jan 19 '16 at 22:35
If it happens, you are the big winner of the git lottery and will become famous in the software development world! ;-) — Philippe, Jan 19 '16 at 23:41
@Philippe "how git would handle if that happen"? See http://stackoverflow.com/a/34599081/6309 — VonC, Jan 20 '16 at 06:08

score 10 · Answer 1 · edited Jun 20 '20 at 09:12

Are there duplicated hashes across all the repositories in the world?

Possibly yes, but that's extremely unlikely. Let me quote the Git book on this one, which contains a very illustrative example:

A lot of people become concerned at some point that they will, by random happenstance, have two objects in their repository that hash to the same SHA-1 value. What then?

If you do happen to commit an object that hashes to the same SHA-1 value as a previous object in your repository, Git will see the previous object already in your Git database and assume it was already written. If you try to check out that object again at some point, you’ll always get the data of the first object.

[...]

Here’s an example to give you an idea of what it would take to get a SHA-1 collision. If all 6.5 billion humans on Earth were programming, and every second, each one was producing code that was the equivalent of the entire Linux kernel history (3.6 million Git objects) and pushing it into one enormous Git repository, it would take roughly 2 years until that repository contained enough objects to have a 50% probability of a single SHA-1 object collision. A higher probability exists that every member of your programming team will be attacked and killed by wolves in unrelated incidents on the same night.
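The "roughly 2 years" figure quoted above can be sanity-checked with the standard birthday approximation, n ≈ sqrt(2 · N · ln 2) for a 50% collision chance among N possible hash values. The sketch below is only a back-of-the-envelope check: it assumes SHA-1 output behaves like a uniform random 160-bit value, and the constant and function names are made up for illustration.

```python
import math

HASH_BITS = 160                          # SHA-1 produces 160-bit digests
PEOPLE = 6.5e9                           # figure taken from the quote
OBJECTS_PER_PERSON_PER_SECOND = 3.6e6    # "entire Linux kernel history" per second
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def objects_for_collision_probability(p, bits=HASH_BITS):
    """Approximate number of random objects needed for collision probability p.

    Uses the birthday approximation n ~ sqrt(2 * N * ln(1 / (1 - p))),
    where N = 2**bits is the number of possible hash values.
    """
    space = 2.0 ** bits
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p)))


needed = objects_for_collision_probability(0.5)
rate = PEOPLE * OBJECTS_PER_PERSON_PER_SECOND   # objects written per second
years = needed / rate / SECONDS_PER_YEAR

print(f"objects needed for a 50% chance: {needed:.3e}")
print(f"years at the quoted rate: {years:.2f}")
```

Under these assumptions the result comes out a little under two years, which agrees with the book's estimate.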

In short: yes, a SHA1 collision is theoretically possible, but so astronomically unlikely that Git simply does not consider this case.
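To make the "you'll always get the data of the first object" behaviour concrete, here is a toy content-addressed store. It is not Git's implementation; it only models the rule the Git book describes (an object whose key already exists is assumed to be written already). The class ToyObjectStore and the helper find_colliding_pair are hypothetical, and the hash is deliberately truncated to two hex digits so that a collision can actually be found.

```python
import hashlib


class ToyObjectStore:
    """Toy model of a content-addressed store (not Git's real code)."""

    def __init__(self, hex_digits=40):
        # Assumption for the demo: truncating the digest forces collisions.
        self.hex_digits = hex_digits
        self.objects = {}

    def key_for(self, data: bytes) -> str:
        return hashlib.sha1(data).hexdigest()[: self.hex_digits]

    def write(self, data: bytes) -> str:
        key = self.key_for(data)
        # If the key is already present, assume the object was written before.
        if key not in self.objects:
            self.objects[key] = data
        return key

    def read(self, key: str) -> bytes:
        return self.objects[key]


def find_colliding_pair(store):
    """Brute-force two different inputs that share a (truncated) key."""
    seen = {}
    i = 0
    while True:
        data = f"object {i}".encode()
        key = store.key_for(data)
        if key in seen:
            return seen[key], data
        seen[key] = data
        i += 1


store = ToyObjectStore(hex_digits=2)      # only 256 possible keys
first, second = find_colliding_pair(store)
key_first = store.write(first)
key_second = store.write(second)          # silently skipped: key already exists

print(key_first == key_second)            # same key for different contents
print(store.read(key_second) == first)    # reading returns the first object
```

With the full 40-digit hash the brute-force search would never finish in practice, which is the point made above.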

score 0 · Answer 2 · edited May 23 '17 at 12:07

Ultimately, I guess the question is also: are there duplicated hashes across all the repositories in the world?

Unlikely. At the time of writing, no one had ever found a SHA1 collision. Duplicate hashes may well exist across repositories, but in practice they identify the very same objects with the very same contents.
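The reason identical hashes normally mean identical contents is that the ID is derived only from the object's bytes. As a minimal sketch, this is how a blob ID can be reproduced outside Git: SHA-1 over the header "blob <size>" followed by a NUL byte and then the content. The function name git_blob_id is made up for this example; commits and trees follow the same pattern with a different type word in the header.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the SHA-1 object ID Git assigns to a blob with this content."""
    header = b"blob %d\0" % len(content)   # object type, size in bytes, NUL
    return hashlib.sha1(header + content).hexdigest()


# The same bytes always give the same ID, in any repository on any machine.
same_a = git_blob_id(b"hello\n")
same_b = git_blob_id(b"hello\n")
different = git_blob_id(b"hello!\n")

print(same_a == same_b)        # True: identical content, identical ID
print(same_a == different)     # False: different content, different ID
print(git_blob_id(b""))        # ID of the empty blob
```

The result can be compared against git hash-object for the same bytes; for the empty blob both give e69de29bb2d1d6434b8b29ae775ad8c2e48c5391.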
