I have a file I found lying around. I want to know whether it came from a specific git repo, at any point in its history. Exact-match comparison is good enough. How can I do this?

Filip Haglund
  • `Exact-match comparison` - would comparing `md5sum` be enough? If so, you can use [this script from another SO answer](https://stackoverflow.com/a/32849134/3691891) like this `git-dump.sh ` and then check whether any of the output files matches the original file's `md5sum`. – Arkadiusz Drabczyk Jan 02 '18 at 09:35
  • @ArkadiuszDrabczyk doesn't git already store SHA-1 hashes of all files internally? – Filip Haglund Jan 02 '18 at 09:37
  • Possible duplicate of [Which commit has this blob?](https://stackoverflow.com/questions/223678/which-commit-has-this-blob) – Denys Séguret Jan 02 '18 at 09:56
  • @DenysSéguret IMHO not a full duplicate, the linked question starts with a given hash. [Your answer here](https://stackoverflow.com/a/48058743/711006) clearly states that we need to generate the hash of the file and how to do that. – Melebius Jan 02 '18 at 10:10

1 Answer

You can compute the SHA-1 that git would assign to the file's content using

git hash-object <file path>

This gives you a hash such as this one:

c675fb0fe881673391f078c37e594ec7a51aa222
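
In case it helps to see where that value comes from: for a blob, the hash is the SHA-1 of a short header (`blob <size in bytes>` plus a NUL byte) followed by the raw file content. Here is a minimal sketch that reproduces it without git. The function name `blob_sha1` is made up for illustration, and it assumes a SHA-1 repository (the default), no content filters such as line-ending conversion, and that `sha1sum` is installed.

```shell
#!/bin/sh
# Hypothetical helper: compute the blob ID git would give a file, without git.
# Assumes a SHA-1 repository and no content filters (e.g. CRLF conversion).
blob_sha1() {
    file=$1
    # Size in bytes; tr strips the padding some wc implementations add.
    size=$(wc -c < "$file" | tr -d ' ')
    # Hash the header "blob <size>\0" followed by the raw file content.
    { printf 'blob %s\0' "$size"; cat "$file"; } | sha1sum | cut -d' ' -f1
}
```

Under those assumptions, `blob_sha1 <file path>` should print the same hash as the git command above.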

It's also possible to list all (reachable) blobs together with their file names (many variations of the command are possible), so you can grep that listing for your hash:

git rev-list --objects --all | git cat-file --batch-check='%(objectname) %(objecttype) %(rest)' | grep '^[^ ]* blob' | cut -d" " -f1,3- | grep c675fb0fe881673391f078c37e594ec7a51aa222
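
If you want both steps in one go, here is one possible variation as a rough sketch. The function name `find_blob_paths` and its arguments are made up for illustration. It relies on `git rev-list --objects` printing each blob's hash followed by a path it was reached under, so it skips the `cat-file` type filter and just matches on the hash.

```shell
#!/bin/sh
# Hypothetical helper: check whether a file's exact content exists as a blob
# reachable from any ref of a repository.
# Usage: find_blob_paths <file> [<repo dir>]   (repo dir defaults to ".")
find_blob_paths() {
    file=$1
    repo=${2:-.}
    # Hash the file the same way git hashes blob content.
    hash=$(git hash-object "$file") || return 1
    # Each output line is "<hash> <path>"; grep exits non-zero when nothing matches.
    git -C "$repo" rev-list --objects --all | grep "^$hash"
}
```

A call like `find_blob_paths stray-file.txt /path/to/repo` (both names are placeholders) prints the matching hash and path, and returns a non-zero status when the content never appears in the reachable history.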
Denys Séguret