Why does using shortened commit names when comparing two commits fail?

Question

I want to compare two commits which I suspect to be identical. Why does

$ git diff 9d3fd2..893f5b
fatal: ambiguous argument '9d3fd2..893f5b': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'

$ git diff 9d3fd2 893f5b
fatal: ambiguous argument '9d3fd2': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'

fail, while using the full commit names works

$ git diff 9d3fd263dd6625078cc12b358c97ca2dca51826c  893f5ba2871b13f87365620b4d02e40519f08734

$ git diff 9d3fd263dd6625078cc12b358c97ca2dca51826c..893f5ba2871b13f87365620b4d02e40519f08734

?

I checked the output of git log and didn't find any other commits that share the same shortened commit names.

Thanks.

Related (sort of): http://stackoverflow.com/questions/32405922/in-my-repo-how-long-must-the-longest-hash-prefix-be-to-prevent-any-overlap — jub0bs, May 09 '17 at 19:16

score 4 · Answer 1 · answered May 09 '17 at 18:38

I checked the output of git log and didn't find any other commits that share the same shortened commit names.

There are two possibilities here:

There are other commits, but you missed them.
There are no such commits but there are other Git objects with the same shortened hashes.
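One way to tell these two cases apart is to ask Git which objects start with a given prefix. This is a minimal sketch, assuming a Git version that supports `git rev-parse --disambiguate`; the function name `list_prefix_matches` is made up for illustration.

```shell
# Hypothetical helper: print every object whose hash starts with the given
# prefix, together with its type (commit, tag, tree, or blob).
list_prefix_matches() {
    prefix=$1
    git rev-parse --disambiguate="$prefix" | while read -r hash; do
        printf '%s %s\n' "$hash" "$(git cat-file -t "$hash")"
    done
}
```

Running `list_prefix_matches 9d3fd2` inside the repository should print one line per matching object. More than one line means the prefix is ambiguous. No output means that no object in this repository starts with that prefix.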

To expand on the first point, git log only shows commits reachable from HEAD, or from whatever argument(s) you supply:

git log: show the HEAD commit, and earlier commits reachable from HEAD
git log master: show the tip of master, and earlier commits reachable from that tip commit
git log master develop: show the tip commits of both master and develop, and earlier commits reachable from those two tip commits
git log --branches: show the tip commits of all branches, and all earlier reachable commits
git log --all: show the tip commits of all branches, all tagged commits, the commit referred-to by stash if there is one, and so on: find all references. Use those commits to find all earlier reachable commits.

You might expect that last one to find all commits, but it doesn't. It misses commits that only the reflogs (if those are enabled) still refer to, and it misses unreferenced commits, which a later git gc will eventually prune but which are still in the rubbish bins until then and can still be extracted and viewed.
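To look beyond what `git log --all` shows, something like the following may help. It is a sketch rather than a complete inventory, and the two function names are made up for illustration.

```shell
# Hypothetical helpers; the names are illustrative.

# Hashes of commits reachable from any ref or from any reflog entry.
commits_including_reflogs() {
    git log --all --reflog --format=%H
}

# Hashes of commits that no ref and no reflog entry can reach.
unreachable_commits() {
    git fsck --unreachable 2>/dev/null |
        awk '$1 == "unreachable" && $2 == "commit" { print $3 }'
}
```

Searching the output of both for the shortened hash (for example with `grep '^9d3fd2'`) shows whether a commit that `git log` never displayed is the one causing the clash.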

In any case, even if you manage to name every commit in the repository, repositories hold four types of object: commits, annotated tags, trees, and blobs. The hash ID of an annotated tag, tree, or blob is usually kept fairly well hidden, since those are more "implementation detail" than "useful data". But they still occupy repository hash space, and hence can make a shortened hash ambiguous just as another commit can.
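To see how many objects of each type share the hash space, a tally along these lines can be used. It is a sketch that assumes a Git version with `git cat-file --batch-all-objects`; the function name is made up for illustration.

```shell
# Hypothetical helper: count every object in the repository, grouped by type.
count_objects_by_type() {
    git cat-file --batch-all-objects --batch-check='%(objecttype)' |
        sort | uniq -c
}
```

If only the overall total matters, `git count-objects -v` reports the number of loose objects (`count`) and packed objects (`in-pack`) without listing them.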

The number of characters needed to make any particular shortened hash unique is difficult to predict exactly, but a probabilistic estimate is not too hard, if you know the total number of objects (of all four types) stored in the repository. But it's hard to know the total number of objects in the first place—you'll occasionally see Git counting them up, and it takes a while in a big repository! In your case:

9d3fd2

is only six characters; adding a seventh reduces the chance of ambiguity by another factor of 16, and will probably suffice. If the repository is very large, or you are very unlucky, you may need 8, 9, 10, or even more characters to get a unique hash ID.
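The probabilistic estimate can be sketched with a little arithmetic. The helper below is hypothetical and assumes that hashes behave like uniformly random values. It estimates the chance that one particular object's prefix is shared by at least one other object, which is not the same as the chance that any two objects in the repository collide.

```shell
# Hypothetical helper: rough chance that a given object's abbreviated hash
# is also a prefix of at least one of the other objects in the repository.
# Assumes hashes are uniformly distributed.
prefix_clash_chance() {
    objects=$1   # total number of objects of all four types
    digits=$2    # length of the abbreviated hash, in hex digits
    awk -v n="$objects" -v k="$digits" 'BEGIN {
        # Each other object misses the prefix with probability 1 - 1/16^k.
        p = 1 - (1 - 1 / 16 ^ k) ^ (n - 1)
        printf "%.6f\n", p
    }'
}
```

For a repository with 100,000 objects, `prefix_clash_chance 100000 6` comes out at roughly 0.6%, and a seventh digit divides that by about 16. In practice it is easier to let Git choose: `git rev-parse --short=7 <full-hash>` prints a unique prefix of at least seven characters.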

score 2 · Answer 2 · answered May 09 '17 at 18:26

Probably there's more than one revision with one of the shortened IDs you are using. Try providing a longer ID (I think Linus kind-of-recently took this length to some 11 chars).

eftshift0
