10

In Git, how do I find the SHA-1 IDs of all blobs in the object database that contain a string pattern? git-grep provides only the file paths and not the sha1 IDs.

wjl
H Krishnan

3 Answers

6

EDIT: Update based on new testing results using Git version 2.7.4

It looks like the solution I originally posted only walks the reflog. If you delete a reflog entry, that entry is no longer searched, even though the object still exists.

So you will have to do something like the following (note that --grep here filters on commit messages rather than blob contents):

{
    git rev-list --objects --all --grep="text"
    git rev-list --objects -g --no-walk --all --grep="text"
    git rev-list --objects --no-walk --grep="text" \
        $(git fsck --unreachable |
          grep '^unreachable commit' |
          cut -d' ' -f3)
} | sort | uniq

Derived from: Git - how to list ALL objects in the database

Old solution: Only works if object is in reflog

To find commits reachable from the reflog whose changes add or remove the string "text":

git log --reflog -Stext

To find commits reachable from the reflog whose commit message matches the pattern "pattern" (--grep searches log messages, not file contents):

git log --reflog --grep=pattern

This searches every commit that is still reachable from a reflog entry, so it works even if the commit/branch has been deleted. Once an object is removed from the local repository (e.g. by a gc), it is no longer included in the search.

Community
vman
  • I usually don't need to search through deleted elements (don't forget the reflog is local to your repo: if you clone it again, the reflog will be empty). But for a local search in the same repo, that will indeed work. +1 – VonC May 13 '16 at 07:08
  • Right, I agree that it is not common to search through all objects. However, this is what the OP wants. It can be useful: one example I can think of is if you mistakenly committed a password and want to get rid of it from the local repository. Another would be if you deleted something from the reflog and want to recover it (assuming a gc has not removed the blob). – vman May 13 '16 at 19:42
  • I tried this, but it didn't work. The documentation says git-rev-list --grep matches commit messages, not blob content, which may be the reason. However, following the same idea I found another solution: `git grep 'text' $(git fsck --unreachable | grep '^unreachable blob' | cut -d' ' -f3)` did search all unreachable blobs for the text. – William Leung Oct 07 '17 at 19:49
4

I did the following to figure out if some code I'd written was lost forever, or was perhaps hidden in some "unreachable" commit:

# Assuming you're at the root of the git repository
cd .git/objects
# Search all loose objects for a given string (assuming you're in .git/objects)
# The sed expression is quoted so the shell does not treat "!" as history expansion
find ?? -type f | sed 's!/!!' | git cat-file --batch | grep --binary-files=text <SEARCH_STRING>

This will produce output if any git object contains <SEARCH_STRING>. However, it won't tell you which object contains it. But by doing this, I found my missing code and I was eventually able to get it back.
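To see which object matched, one option is to test each blob separately instead of grepping one combined stream. The following is only a sketch and not part of the original answer: find_blobs_containing is a hypothetical helper name, and it assumes a Git recent enough to support git cat-file --batch-all-objects (which also covers packed objects that find ?? does not see) and a grep that accepts -a.

```shell
# find_blobs_containing is a hypothetical helper, not a Git command.
# Prints the ID of every blob in the object database (reachable or not,
# loose or packed) whose content contains the fixed string given as $1.
# Run it from anywhere inside the repository.
find_blobs_containing() {
    # List "<object id> <type>" for every object Git knows about.
    git cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype)' |
    while read -r sha type; do
        # Skip commits, trees and tags.
        [ "$type" = blob ] || continue
        # -a treats binary content as text, -F matches a literal string.
        if git cat-file blob "$sha" | grep -q -a -F -e "$1"; then
            printf '%s\n' "$sha"
        fi
    done
}
```

Calling find_blobs_containing 'some string' prints one blob ID per line. It starts one git cat-file process per blob, so it can be slow on large repositories.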

qff
3

You can try a git log using the pickaxe option:

git log -Sstring --all

See "How to find commit SHA1 of a tree containing a file containing a given string"
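Since the question asks for blob IDs rather than commits, here is a rough sketch of how git grep hits on reachable commits could be turned into blob IDs. It is not taken from the linked answer: blob_ids_in_history is a hypothetical helper name, and it assumes file paths without unusual characters (git grep quotes those in its output) and only covers commits reachable from refs.

```shell
# blob_ids_in_history is a hypothetical helper, not a Git command.
# Prints "<blob id> <path>" for every file, in any commit reachable from
# a ref, whose content contains the fixed string given as $1.
blob_ids_in_history() {
    git rev-list --all |
    while read -r commit; do
        # -l prints "<commit>:<path>" once per matching file;
        # --full-name keeps paths relative to the repository root.
        git grep -l -F --full-name -e "$1" "$commit"
    done |
    while IFS= read -r hit; do
        # "<commit>:<path>" resolves to the blob stored at that path.
        printf '%s %s\n' "$(git rev-parse "$hit")" "${hit#*:}"
    done | sort -u
}
```

For example, blob_ids_in_history 'some string' lists each distinct blob once per path. It runs git grep once per commit, so on a long history it will take a while.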

Community
VonC
  • This only searches commits reachable from the refs, not all objects as the OP wants. Once a commit/branch is deleted, your search will no longer find it. I posted an answer using a modified version of your solution. – vman May 13 '16 at 00:12