This is a little more precise than LeGEC's answer, although that one covers the most common case. What git log --find-object
does is find commits where, from parent to child, the number of occurrences of that particular blob changes.
Suppose, for instance, we create a new empty repository with one initial commit with a README file:
$ mkdir tlog
$ cd tlog
$ git init
Initialized empty Git repository in [path]
$ echo test find-object stuff > README
$ git add README
$ git commit -m initial
[master (root-commit) 2177143] initial
1 file changed, 1 insertion(+)
create mode 100644 README
Now let's create a blob, commit it, and observe its hash ID:
$ echo file content > afile
$ git add afile
$ git commit -m 'add some content'
[master 45c4e39] add some content
1 file changed, 1 insertion(+)
create mode 100644 afile
$ git rev-parse HEAD:afile
dd59d098638313f5d00a7fa657379b33b191f2e2
$ blobid=$(git rev-parse HEAD:afile)
Now let's make a commit that doesn't change the number of files that have that blob hash ID, by adding a file with different content. Then let's add another file, cfile, with the same content as afile, and hence the same blob hash ID:
$ echo different > bfile
$ git add bfile && git commit -m 'add different content'
[master c5a5306] add different content
1 file changed, 1 insertion(+)
create mode 100644 bfile
$ cp afile cfile && git add cfile
$ git commit -m 're-add same content as afile, ie, same blob id'
[master 20c97e5] re-add same content as afile, ie, same blob id
1 file changed, 1 insertion(+)
create mode 100644 cfile
$ git rev-parse HEAD:cfile
dd59d098638313f5d00a7fa657379b33b191f2e2
As you can see, the same hash ID comes up again. (In fact, any repository with a file that matches my afile or cfile has that blob hash ID in it! The commits will have unique hash IDs, but any file whose entire content is the text "file content" plus a single newline will have blob hash ID dd59d098638313f5d00a7fa657379b33b191f2e2.)
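To see why the blob ID depends only on the content, here is a minimal sketch that recomputes it by hand. It assumes a SHA-1 repository (the default) and that sha1sum is installed (on macOS, shasum is the usual substitute). The variable names are mine, chosen for illustration.

```shell
# A blob's ID is the SHA-1 of the header "blob <size>\0" followed by the content.
content='file content'
size=$(( ${#content} + 1 ))        # +1 for the newline that echo appended

# Hash header plus content by hand, without asking Git.
manual=$(printf 'blob %d\0%s\n' "$size" "$content" | sha1sum | cut -d' ' -f1)
echo "$manual"

# Compare against the blob ID shown in the transcript above.
expected=dd59d098638313f5d00a7fa657379b33b191f2e2
if [ "$manual" = "$expected" ]; then echo match; else echo mismatch; fi
```

If Git is at hand, printf 'file content\n' | git hash-object --stdin should agree with the hand-computed value, inside or outside any repository.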
Now let's look at git log --oneline and git log --oneline --find-object=$blobid output:
$ git log --oneline
20c97e5 (HEAD -> master) re-add same content as afile, ie, same blob id
c5a5306 add different content
45c4e39 add some content
2177143 initial
$ git log --oneline --find-object=$blobid
20c97e5 (HEAD -> master) re-add same content as afile, ie, same blob id
45c4e39 add some content
We see commit 45c4e39 in both cases because comparing 2177143 initial to 45c4e39 add some content shows that the number of files that have $blobid as their object hash has gone from zero to one. We see 20c97e5 because comparing that commit to its parent, c5a5306, shows that the number of files has gone from one to two. If we remove one copy, the count will change again and we'll see that commit. If we remove both copies, the count will change (to zero) and we'll see that commit.
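The removal cases can be tried out with a sketch like the following. It rebuilds the example history in a scratch directory and then removes the two copies one commit at a time. The function name find_object_demo, the commit helper, the throwaway identity, and the use of mktemp -d are all my own choices for illustration; the commit hashes you get will differ from the ones above.

```shell
# Runs in a subshell, so the caller's working directory is left alone.
find_object_demo() (
    repo=$(mktemp -d) || exit 1
    cd "$repo" || exit 1
    git init -q

    # Helper: commit with a throwaway identity so no global config is needed.
    commit() { git -c user.name=demo -c user.email=demo@example.com commit -q -m "$1"; }

    # Same history as in the walkthrough above.
    echo test find-object stuff > README; git add README; commit 'initial'
    echo file content > afile;            git add afile;  commit 'add some content'
    echo different > bfile;               git add bfile;  commit 'add different content'
    cp afile cfile;                       git add cfile;  commit 're-add same content as afile'
    blobid=$(git rev-parse HEAD:afile)

    # Remove one copy: the count goes from two to one.
    git rm -q cfile; commit 'remove one copy'
    # Remove the remaining copy: the count goes from one to zero.
    git rm -q afile; commit 'remove last copy'

    git log --oneline --find-object="$blobid"
)
```

Calling find_object_demo should list the two commits that added a copy plus the two that removed one, while 'initial' and 'add different content' stay out of the output.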
What we're seeing, in other words, is every commit in which the count of blob objects with the given hash ID changes.
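As a rough model of that rule (not Git's actual implementation), the count can be computed with git ls-tree and compared between each commit and its parent. The function names count_blob and list_count_changes are hypothetical, and the sketch deliberately looks only at the first parent, so it says nothing useful about merges.

```shell
# count_blob COMMIT BLOBID
# Print how many paths in COMMIT's tree point at blob BLOBID.
count_blob() {
    # ls-tree -r lines look like: <mode> <type> <hash><TAB><path>
    git ls-tree -r "$1" | awk -v id="$2" '$3 == id { n++ } END { print n + 0 }'
}

# list_count_changes BLOBID
# Walk history from HEAD and print each commit whose count differs from
# that of its first parent (assumption: merges are not handled specially).
list_count_changes() {
    blob=$1
    git rev-list HEAD | while read -r commit; do
        new=$(count_blob "$commit" "$blob")
        if git rev-parse -q --verify "$commit^" >/dev/null; then
            old=$(count_blob "$commit^" "$blob")
        else
            old=0                  # root commit: nothing to compare against
        fi
        if [ "$new" -ne "$old" ]; then
            echo "$commit $old -> $new"
        fi
    done
}
```

Run inside the example repository as list_count_changes "$blobid"; on a linear history like this one it should pick out the same commits as git log --find-object=$blobid.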
There's a bug, of sorts, in this git log option: it relies on the fact that each of these commits has one single parent. If we have a merge commit (a commit with two or more parents), Git has to compare the blob hash IDs in the merge to those in each of its parents. Perhaps the count changes in one comparison but not in the other. What should Git do with this? Git's current answer is that it craps out completely here, hence "bug of sorts", but with a fix that's in the queue, you get something that's better but still imperfect, as there's no obvious Right Answer for this case.
(The bug is that Git is going through a special code path in git log that's meant to handle History Simplification, and that's the wrong thing to do here. The proposed fix makes Git go through a more suitable path, so that you'll at least see that the merge has some change in the count, which is clearly better. But other options still have cases of their own that don't always work right. Git needs a general solution for diffs-across-merges, and that requires a framework that currently does not exist.)