55

I've added a file to the 'index' with:

git add myfile.java

How do I find out the SHA1 of this file?

git-noob
  • 3
    Just for reference: Kind of [the inverse question](http://stackoverflow.com/questions/460331/git-finding-a-filename-from-a-sha1) – Albert Apr 05 '14 at 12:25
  • 7
    `git rev-parse :myfile.java` – jthill Nov 06 '15 at 08:31
  • @jthill, not sure why this is not an answer on its own. – akhan Sep 08 '16 at 14:49
  • @jthill -- this worked perfectly for me. I just wanted to do a git diff of two blobs, and this gave me the sha1 of each blob no problems. (NOTE: you can specify a separate branch with `:myfile.java`) – DryLabRebel Jul 02 '23 at 23:18

4 Answers

123

It's an old question but one thing needs some clarification:

This question and the answers below talk about the Git hash of a file which is not exactly the same as "the SHA1 of this file" as asked in the question.

In short:

If you want to get the Git hash of the file in index - see the answer by CB Bailey:

git ls-files -s $file

If you want to get the Git hash of any file on your filesystem - see the answer by cnu:

git hash-object $file

If you want to get the Git hash of any file on your filesystem and you don't have Git installed:

(printf 'blob %d\0' "$(wc -c < "$file")"; cat "$file") | sha1sum

(The above shows how the Git hash is actually computed - it's not the sha1 sum of the file but a sha1 sum of the string "blob SIZE\0CONTENT" where "blob" is literally a string "blob" (it is followed by a space), SIZE is the file size in bytes (an ASCII decimal), "\0" is the null character and CONTENT is the actual file's content).

If you want to get just "the SHA1 of this file" as was literally asked in the question:

sha1sum < $file

If you don't have sha1sum you can use shasum -a1 or openssl dgst -sha1 (with a slightly different output format).
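
To make the difference between the two hashes concrete, here is a minimal sketch as shell functions. The names `git_blob_id` and `plain_sha1` are made up for illustration, and the sketch assumes `sha1sum` is on the PATH (swap in `shasum -a1` if it is not). It hashes the raw bytes only, so it may disagree with `git hash-object` in a repository that applies content filters such as line-ending conversion.

```shell
# Hypothetical helper: compute the Git blob ID of a file without calling git.
git_blob_id() {
    # Git hashes the header "blob <size in bytes>\0" followed by the raw content.
    { printf 'blob %d\0' "$(wc -c < "$1")"; cat -- "$1"; } | sha1sum | cut -d' ' -f1
}

# Hypothetical helper: plain SHA-1 of the content alone, for comparison.
plain_sha1() {
    sha1sum < "$1" | cut -d' ' -f1
}
```

For an empty file, `git_blob_id` prints `e69de29bb2d1d6434b8b29ae775ad8c2e48c5391` (Git's empty-blob ID), while `plain_sha1` prints `da39a3ee5e6b4b0d3255bfef95601890afd80709`.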

Edward Thomson
rsp
  • Another question, is there a hash computed on the combined file-and-folders contents of a commit? i.e. two different commits might have the same contents (`git diff` is empty) merely because the comments, timestamps, or histories are different. Given two git repos on two machines that aren't connected to each other, how would I confirm that two commits actually have the same contents? – Aaron McDaid Aug 02 '18 at 18:50
  • @AaronMcDaid Yes, there are blobs, trees and commits, and all of them have hashes. Two commits with the same contents on different machines will have the same root tree hash, but likely different commit hashes, because the commit hash also covers the parents, author, committer, timestamps and message. Only if all of those are identical will the commit hashes match. – Ark-kun Aug 25 '20 at 04:43
68

You want the -s option to git ls-files. This gives you the mode and sha1 hash of the file in the index.

git ls-files -s myfile.java

Note that you do not want git hash-object as this gives you the sha1 id of the file in the working tree as it currently is, not of the file that you've added to the index. These will be different once you make changes to the working tree copy after the git add.
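
As a rough sketch of that distinction, the two helpers below isolate the staged blob ID and compare it with the working-tree hash. The function names are hypothetical, and the sketch assumes the path has a single index entry (no unresolved merge conflict, which would produce several lines).

```shell
# Hypothetical helper: print only the blob ID the index records for a path.
staged_blob_id() {
    # `git ls-files -s` prints "<mode> <object> <stage><TAB><path>"; keep field 2.
    git ls-files -s -- "$1" | awk '{ print $2 }'
}

# Hypothetical helper: succeed only if the working-tree copy still hashes
# to the same blob ID as the staged copy.
staged_matches_worktree() {
    [ "$(staged_blob_id "$1")" = "$(git hash-object -- "$1")" ]
}
```

Right after `git add myfile.java`, `staged_matches_worktree myfile.java` succeeds; once the working-tree copy is edited it fails, while `staged_blob_id myfile.java` keeps printing the ID of the version that was added.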

CB Bailey
23
$ git hash-object myfile.java
802992c4220de19a90767f3000a79a31b98d0df7
cnu
  • 12
    Here's why this answer got rated worse than the one by Charles: this actually gives you the SHA1 of the version of the file that's in the working tree, not of the indexed/staged version. It also has the disadvantage that it needs to recalculate the SHA1 even if it's already stored in the index. – Jan Krüger Jan 26 '09 at 23:29
  • @JanKrüger Thanks for clarifying that! Very helpful. Now, `git hash-object` is still useful for something, as otherwise one would need to do something like `( perl -e '$size = (-s shift); print "blob $size\x00"' foo.txt && cat foo.txt ) | openssl sha1`. Furthermore, it produces the hash by itself whereas the `ls-files -s` appears to require some `cut`ting to isolate that hash. – Steven Lu Jul 25 '13 at 03:21
  • 1
    How to request real SHA1? `git ls-files -s README.md` (=`git hash-object README.md`) **is not `sha1sum README.md`** in any repository! – Peter Krauss Apr 24 '17 at 12:54
0

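Warning: if you hash many files in a single `git hash-object --literally` invocation (for example via `--stdin-paths`), you can hit a "Too many open files" error, because of a file descriptor leak that was fixed in Git 2.40 (Q1 2023):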

See commit 590b636 (18 Jan 2023) by Jeff King (peff).
(Merged by Junio C Hamano -- gitster -- in commit 630ae5e, 27 Jan 2023)

hash-object: fix descriptor leak with --literally

Signed-off-by: Jeff King

In hash_object(), we open a descriptor for each file to hash (whether we got the filename from the command line or --stdin-paths), but never close it.

For the traditional code path, which feeds the result to index_fd(), this is OK; it closes the descriptor for us.

But 5ba9a93 ("hash-object: add --literally option", 2014-09-11, Git v2.2.0-rc0 -- merge) added a second code path, which does not close the descriptor.
There we need to do so ourselves.

You can see the problem in a clone of git.git like this:

$ git ls-files -s | grep ^100644 | cut -f2 |
  git hash-object --stdin-paths --literally >/dev/null
fatal: could not open 'builtin/var.c' for reading: Too many open files

After this patch, it completes successfully.

With Git 2.39, one of my repositories does show the error:

vonc@vclp MINGW64 ~/git/seec2 (main)
$ git ls-files -s | grep ^100644 | cut -f2 | \
git hash-object --stdin-paths --literally
fatal: could not open 'data/commits/7f/a3338870d66dd3946c5c3a0bd09dadb798893d' for reading: Too many open files
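
If upgrading is not an option, one possible workaround (a sketch, not something the commit itself suggests) is to hash in batches so that no single `git hash-object` process accumulates more open descriptors than the limit allows. The function name and the default batch size of 200 are assumptions; pick a size well below what `ulimit -n` reports. Like the pipeline above, it does not handle paths that `git ls-files` has to quote, and it relies on `xargs -0`, which GNU and BSD xargs provide.

```shell
# Hypothetical workaround for Git < 2.40: run hash-object once per batch,
# so each process opens at most "$batch" files before it exits.
hash_regular_files_in_batches() {
    batch=${1:-200}   # assumed default; keep it below the open-file limit
    git ls-files -s | grep '^100644' | cut -f2 |
        tr '\n' '\0' |
        xargs -0 -n "$batch" git hash-object --literally --
}
```

Calling `hash_regular_files_in_batches 100` prints one blob ID per regular tracked file, in the same order as `git ls-files`.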
VonC