
I am using this script to find the n largest blobs in my repository:

https://stubbisms.wordpress.com/2009/07/10/git-script-to-show-largest-pack-objects-and-trim-your-waist-line/

An example of the output:

54016,13273,0ef462bf57e8c036b00b52d6cc0fd91b2fc2a827 Data/Db.MDF
30976,8734,3e162c8313995980c8d6fc434c06789373364a47 Tools/connector.dll

The two SHAs above identify blobs. Now I would like to locate the commit/branch that contains those blobs. I first tried:

$ git branch -a --contains 3e162c8313995980c8d6fc434c06789373364a47
error: object 3e162c8313995980c8d6fc434c06789373364a47 is a blob, not a commit

As the message says, the SHA is for a blob, not a commit. This led me to: Which commit has this blob?

I created the two scripts from that post and added them to the root of my repository. But when I run them, they produce no output:

MINGW64 /c/tmp/MyRepo (master)
$ ./blop-to-commit.sh 3e162c8313995980c8d6fc434c06789373364a47

MINGW64 /c/tmp/MyRepo (master)

I also tried running it on a local bare clone of the repository:

MINGW64 /c/tmp/MyRepo.bare (BARE:master)
$ ./blop-to-commit.sh 3e162c8313995980c8d6fc434c06789373364a47

MINGW64 /c/tmp/MyRepo.bare (BARE:master)

Any ideas why I don't get the commit/branch that contained that blob at some point in history?

EDIT/SOLUTION:

Seems I just had to add the --all option to the git log command:

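#!/bin/sh
# Usage: ./blop-to-commit.sh <blob-sha> [extra git log options]
obj_name="$1"
shift
# --all walks every ref, not only the current branch.
# tformat: ends every line with a newline, so `read` also sees the last commit.
git log --all "$@" --pretty=tformat:'%T %h %s' \
| while read -r tree commit subject ; do
    if git ls-tree -r "$tree" | grep -q "$obj_name" ; then
        echo "$commit" "$subject"
    fi
done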
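
A possible shortcut, offered as a sketch rather than a replacement for the script above: Git 2.16 and newer have a `--find-object` option for `git log`, which limits the output to commits whose diff adds or removes the given object. The helper name `commits_touching_blob` is hypothetical, and the sketch assumes your Git is recent enough.

```shell
#!/bin/sh
# Sketch: list commits on any ref whose diff adds or removes the given blob.
# Assumes Git 2.16 or newer (for `git log --find-object`).
# `commits_touching_blob` is a hypothetical helper name.
commits_touching_blob() {
    blob="$1"
    # --all walks every ref, not just the current branch.
    git log --all --find-object="$blob" --pretty=tformat:'%h %s'
}

# Example call, using a blob ID from the question:
# commits_touching_blob 3e162c8313995980c8d6fc434c06789373364a47
```

Unlike the loop over `git ls-tree`, this reports only the commits where the blob appears or disappears, not every commit whose tree still contains it.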

As suggested below:

$ git fsck
Checking object directories: 100% (256/256), done.
Checking objects: 100% (100278/100278), done.
Checking connectivity: 100342, done.
dangling commit 3f8cd0a581ec694e7371f7e4183e1cad8fa87647
dangling commit d5c0f41337ae1ef8e5cfbfd4f70077c36d231cf1
dangling commit b01831f4e6679ef2696a83e6dbaa04eaf6748f85
dangling commit 82b32531c23202d123f693bba64b040b3247636b
dangling commit 4fa8ce87c268a7ddb7c4e72d6810f70e197d5812
dangling commit 3ce38a0b8e5dbb7424a88359bbe0d9130ced34dc

I then did:

git reflog expire --expire-unreachable=now --all
git gc --prune=now

But I still don't see corresponding commits for the blobs originally listed.

u123

2 Answers

Because you already have both the hash of the blob and its path, a very performant way to discover the SHA-1 of the commit that introduced the file is to use this method. Then, as you tried before, you can use git branch -a --contains <sha1> to find the branches that contain that commit.
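
The linked method is not reproduced on this page. One way to get a similar result, shown here only as a sketch, is to ask `git log` for the commits in which the path was added. The helper name `commit_that_added_path` is hypothetical; the example path comes from the question.

```shell
#!/bin/sh
# Sketch: print the commit(s), on any ref, in which PATH was added.
# `commit_that_added_path` is a hypothetical helper name.
commit_that_added_path() {
    path="$1"
    # --diff-filter=A keeps only commits where the path has status "Added".
    git log --all --diff-filter=A --pretty=tformat:'%H %s' -- "$path"
}

# Example calls:
# commit_that_added_path Tools/connector.dll
# git branch -a --contains <sha1-printed-above>
```

If the file was replaced by a different blob later on, this finds where the path first appeared, so it is worth confirming the blob ID with `git rev-parse <commit>:<path>`.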

kalebo

You can run this command to get the sizes of your files (much simpler):

git ls-tree --full-tree -r --long HEAD | sort -rnk4

How to find where the SHA-1 is?

Note: once you add a file to Git (git add, not commit), Git stores it as an object and computes its SHA-1. This means that even if you never commit the file, the object is still stored in your .git folder, even though it is not part of any commit tree.
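
To check whether a given blob falls into that category, one option is to compare it against everything reachable from the refs. This is a minimal sketch; the helper name `blob_unreachable_from_refs` is hypothetical, and it deliberately ignores the index and the reflogs.

```shell
#!/bin/sh
# Sketch: succeed (exit 0) if the blob exists in the object database
# but is not reachable from any branch, tag, or other ref.
# `blob_unreachable_from_refs` is a hypothetical helper name.
blob_unreachable_from_refs() {
    blob="$1"
    # cat-file -e exits non-zero when the object does not exist at all.
    git cat-file -e "$blob" 2>/dev/null || return 2
    # rev-list --all --objects prints every object reachable from any ref.
    if git rev-list --all --objects | grep -q "^$blob" ; then
        return 1    # reachable from at least one ref
    fi
    return 0        # stored, but not part of any commit on a ref
}

# Example call:
# blob_unreachable_from_refs 3e162c8313995980c8d6fc434c06789373364a47 && echo "not in any commit"
```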

Assuming that you did commit the object to a branch, here is how to find out which branches it is in:

git branch --contains <commit>

You can also search tags instead of branches:

git tag --contains <commit>

What to do if the commit is not in any branch or tag?

It simply means that the SHA-1 is not in any commit (as explained above).

Run git fsck to find out whether it is a dangling object, or clean your repo before running the size calculation.

# check for dangling objects
git fsck --full
# clean repo
git gc --prune=now
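
Before pruning, it can help to see which blobs Git considers unreachable and how large they are. The following is a sketch only; the helper name `list_unreachable_blobs` is hypothetical, and `--no-reflogs` is used on the assumption that objects kept alive only by reflog entries should be listed as well.

```shell
#!/bin/sh
# Sketch: print "<size-in-bytes> <sha>" for every unreachable blob, largest first.
# `list_unreachable_blobs` is a hypothetical helper name.
list_unreachable_blobs() {
    # fsck --unreachable prints lines such as "unreachable blob <sha>".
    git fsck --unreachable --no-reflogs 2>/dev/null \
    | awk '$1 == "unreachable" && $2 == "blob" { print $3 }' \
    | while read -r sha ; do
        # cat-file -s prints the object's uncompressed size.
        printf '%s %s\n' "$(git cat-file -s "$sha")" "$sha"
    done \
    | sort -rn
}

# Example call:
# list_unreachable_blobs | head
```

If the blob IDs from the size report show up in this list, they are not part of any commit on a ref, which would explain why no commit is found for them.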
CodeWizard
    I have updated my post. I did try git branch -a --contains, but as the error message shows, it does not accept blob SHAs. Note that I am looking for commits containing blobs from back in history that are not necessarily at the tip of any branch. – u123 Feb 22 '16 at 12:30