1

How to list all git commits that has a given tree hash id ? (top most tree or sub-tree objects)

I would like to seek for every commit from every branch, even dangling commits, so it's a deep seek in the whole git database.

Example - given a database with these commits:

COMMIT: a1b2c3, tree abcd00
COMMIT: 9a9b9c, tree 090807 (this tree has a sub-tree abcd00)
COMMIT: aaccdd, tree 02ff00

Looking for tree object abcd000 should list:

a1b2c3
9a9b9c

EDIT: I've tried this command, but it doesn't work for sub-trees. By the way, is it reliable to seek for top most trees on non-detached HEADs?

git  log --oneline --all --pretty="tree %T: commit %H" | grep ^"tree $mytreeid"
Zoe
  • 27,060
  • 21
  • 118
  • 148
Luciano
  • 2,695
  • 6
  • 38
  • 53
  • To search sub-trees, you must use `git ls-tree`, typically with `-r` (recursive). Tree objects are usually mere implementation detail, though: the interesting searches are generally those for blob objects. There exist answers here on SO for finding commit hashes containing specified blob hashes. – torek Jan 10 '18 at 16:27

2 Answers2

4

I think you may be misusing the term "detached HEAD" here to refer to a dangling commit (and I'm basing this answer on that assumption).

To start with, we would need a list of all the commits in a repository, including "dangling" commits that aren't currently reachable from any branch, tag, or other reference. At some point, git cat-file developed options that make this really easy:

git cat-file --batch-check --batch-all-objects

If your version of git doesn't have these options, there are other ways to get a list of all objects.

The above will produce output like:

2bd7a7d258cb8c0f529e267e72c37bfee2be3a92 tree 32
2d83b892c0922c9168d3c474e73da24301bc86bf tree 64
3eda82acd62f56b19d88a80650ed88428be8ac9b commit 231
42dae79fef2ede926a081821e6a7cf89387cd9f0 tree 66
5e9fac93b855f5cf5ed44969cf9cc53121195377 blob 29
b01735609620636d7b0179a940f7409a32041f87 commit 182

We're only interested in commits, so we would filter that through awk:

$ git cat-file  --batch-check --batch-all-objects | awk '$2 == "commit" {print $1}'
3eda82acd62f56b19d88a80650ed88428be8ac9b
b01735609620636d7b0179a940f7409a32041f87

Now, for each commit, we need to first extract the top-level tree id:

git show --quiet --format='%T' <commit id>

And then recursively list all the trees contained in that tree:

git ls-tree -r -t <tree id>

Putting it all together, we get:

#!/bin/sh

git cat-file  --batch-check --batch-all-objects |
    awk '$2 == "commit" {print $1}' |
    while read cid; do
        tree=$(git show --quiet --format='%T' $cid)
        echo $cid $tree
        git ls-tree -r -t $tree |
            awk -v cid=$cid '$2 == "tree" {print cid,$3}'
    done

This will, for every tree object discovered, output a line of the format:

<commit id> <tree id>

So the answer to your question is to pipe the output of the above script into grep <tree id> (or modify the script to only output the specific information you want).

larsks
  • 277,717
  • 41
  • 399
  • 399
  • Great answer! Could you point me to where I can get explanations that differentiates dangling commits to commits in detached HEADs ? Tnx in advance. – Luciano Jan 10 '18 at 17:39
  • Sure. A [dangling commit](https://stackoverflow.com/questions/18514659/git-what-is-a-dangling-commit-blob-and-where-do-they-come-from) means "a commit that isn't reachable by any reference". A detached head is what you get when you check out a specific commit id rather than a named reference (so it refers more to the state of your current repository than it does to an object). – larsks Jan 10 '18 at 17:44
  • So, when in a detached HEAD, if we create a chain of let's say 4 new commits, we have then a detached HEAD with 4 dangling commits and all other previously commits although in that same branch but detached HEAD aren't dangling commits (supposing we checked out an arbitrary commit in the middle of a tagged branch). Is that right ? – Luciano Jan 10 '18 at 17:51
  • 1
    Errr...yes. If you create commits on top of a detached head, then git won't be able to update any references so they would become dangling commits as soon as you change to another branch (before you do that they are still referenced by `HEAD`). – larsks Jan 10 '18 at 17:57
0

Git has a build in command to check if a given branch contains specific commit.

git branch --contains <commit>

In order to list all the branches including the remote branches and not only the local branches you need to add the -r to the command.


Another thing: if you only need to track specific file and not a commit you can use this:

# display list of commits which includes a given file
git log --<file path>

# if the file was also renamed in the past add the `--follow`
git log --follow -- <path>
CodeWizard
  • 128,036
  • 21
  • 144
  • 167
  • In my case I don't have neither the the commit hash nor file names, however the commit(s) is(are) what I'm looking for. I need to find all commits that has a given tree hash id. – Luciano Jan 10 '18 at 16:20