172

I'm trying to do something fancy here with Git hooks, but I don't really know how to do it (or whether it's even possible).

What I need to do is: in every commit I want to take its hash and then update a file in the commit with this hash.

Any ideas?

Mrchief
Felipe Kamakura
  • 16
    Basically, I have a web application and I want to associate an installed version of that application with the exact commit that version corresponds to. My initial idea was to update a sort of about.html file with the commit hash. But after studying git's object model, I realized that this is kind of impossible =/ – Felipe Kamakura Aug 09 '10 at 19:22
  • 37
    This is a very practical problem. I ran into it too! – Li Dong Jul 31 '13 at 09:20
  • 8
    As for me, I would like my program to write a message like this to the logs: "myprog starting up, v.56c6bb2". That way, if someone files a bug and sends me the log files, I can find out *exactly* what version of my program was running. – Edward Falk Jun 30 '16 at 20:45
  • 5
    @Jefromi, the actual use case is in fact very common, and hits beginners very easily. Having the real version somehow "imprinted" into baselined files is a basic need, and it's far from obvious why it would be a wrong idea, e.g. because that's pretty much your only option with manual revision control hacks. (Remember beginners.) Add to that that many projects simply don't have any sort of build/installation/deployment step at all which could grab and stamp the version into live files. Regardless, instead of pre-commit, the post-checkout hook could help even in those cases. – Sz. Nov 21 '16 at 13:21
  • 1
    This is impossible! If you can do this you broke the SHA-1 hash algorithm... https://ericsink.com/vcbe/html/cryptographic_hashes.html – betontalpfa May 18 '19 at 09:32

8 Answers

106

I would recommend doing something similar to what you have in mind: placing the SHA1 in an untracked file, generated as part of the build/installation/deployment process. It's obviously easy to do (git rev-parse HEAD > filename or perhaps git describe [--tags] > filename), and it avoids doing anything crazy like ending up with a file that's different from what git's tracking.

Your code can then reference this file when it needs the version number, or a build process could incorporate the information into the final product. The latter is actually how git itself gets its version numbers - the build process grabs the version number out of the repo, then builds it into the executable.
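A minimal sketch of such a build step, for illustration only. The helper name current_version and the output file name VERSION are assumptions, not part of the answer; VERSION would be listed in .gitignore so that it stays untracked.

```shell
#!/bin/sh
# Sketch: print an identifier for the checked-out commit.
# current_version is a hypothetical helper name.
current_version() {
    # Prefer the nearest tag; fall back to the full hash when there are no
    # tags, and to "unknown" outside a repository (e.g. an exported tarball).
    git describe --tags 2>/dev/null \
        || git rev-parse --verify HEAD 2>/dev/null \
        || echo unknown
}

# Typical use in a build or deploy script (VERSION is an assumed file name):
#   current_version > VERSION
```

The application can then read VERSION at startup, or the build can embed its contents into the final product.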

Cascabel
  • 4
    Could someone further expound with a step by step on how to do this? Or at least a nudge in the right direction? – Joel Worsham Dec 01 '14 at 15:52
  • 1
    @Joel How to do what? I mentioned how to put the hash in a file; the rest is presumably something about your build process? Maybe a new question if you're trying to ask about that part. – Cascabel Dec 01 '14 at 15:56
  • 1
    In my case, I added a rule to my Makefile that generates a "gitversion.h" file on every build. See http://stackoverflow.com/a/38087913/338479 – Edward Falk Jun 30 '16 at 20:46
  • 1
    You might be able to automate this with a "git-checkout" hook. The problem is that the hooks would have to be manually installed. – Edward Falk Jun 30 '16 at 20:47
26

It's impossible to write the current commit hash into the commit itself: even if you managed to pre-calculate the future commit hash, it would change as soon as you modified any file.

However, there are three options:

  1. Use a script to increment 'commit id' and include it somewhere. Ugly
  2. .gitignore the file you're going to store the hash into. Not very handy
  3. In pre-commit, store the previous commit hash :) You don't modify/insert commits in 99.99% cases, so, this WILL work. In the worst case you still can identify the source revision.

I'm working on a hook script, will post it here 'when it's done', but still — earlier than Duke Nukem Forever is released :))

Update: code for .git/hooks/pre-commit:

#!/usr/bin/env bash
set -e

#=== 'prev-commit' solution by o_O Tync
#commit_hash=$(git rev-parse --verify HEAD)
commit=$(git log -1 --pretty="%H%n%ci") # hash \n date
commit_hash=$(echo "$commit" | head -1)
commit_date=$(echo "$commit" | head -2 | tail -1) # 2010-12-28 05:16:23 +0300

branch_name=$(git symbolic-ref -q HEAD) # http://stackoverflow.com/questions/1593051/#1593487
branch_name=${branch_name##refs/heads/}
branch_name=${branch_name:-HEAD} # 'HEAD' indicates detached HEAD situation

# Write it
# Note: the variable is branch_name (not branch), as set above
echo -e "prev_commit='$commit_hash'\ndate='$commit_date'\nbranch='$branch_name'\n" > gitcommit.py

Now the only thing we need is a tool that converts prev_commit,branch pair to a real commit hash :)

I don't know whether this approach can tell merge commits apart. I'll check soon.

boop
kolypto
13

Someone pointed me to "man gitattributes" section on ident, which has this:

ident

When the attribute ident is set for a path, git replaces $Id$ in the blob object with $Id:, followed by the 40-character hexadecimal blob object name, followed by a dollar sign $ upon checkout. Any byte sequence that begins with $Id: and ends with $ in the worktree file is replaced with $Id$ upon check-in.

If you think about it, this is what CVS, Subversion, etc do as well. If you look at the repository, you'll see that the file in the repository always contains, for example, $Id$. It never contains the expansion of that. It's only on checkout that the text is expanded.
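A rough way to see this in action, assuming git is installed. The function name demo_ident, the file name about.txt and the throwaway identity passed with -c are made up for the demo. Note that the value substituted is the blob's object name, not the commit's.

```shell
#!/bin/sh
# Sketch: show $Id$ expansion in a throwaway repository.
# demo_ident, about.txt and the demo identity are illustrative names.
demo_ident() {
    repo=$(mktemp -d) || return 1
    (
        cd "$repo" || exit 1
        git init -q .
        # Enable the ident attribute for one path.
        echo 'about.txt ident' > .gitattributes
        # Single quotes keep the literal $Id$ keyword.
        echo 'Version: $Id$' > about.txt
        git add .gitattributes about.txt
        git -c user.name=demo -c user.email=demo@example.com \
            commit -q -m 'ident demo'
        # Expansion happens on checkout, so force a fresh checkout of the file.
        rm about.txt
        git checkout -q -- about.txt
        cat about.txt
    )
    rm -rf "$repo"
}
```

Calling demo_ident prints a line of the form Version: $Id: <blob object name> $, while the committed blob still contains the unexpanded $Id$.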

  • 9
    `ident` is the hash for the file itself, not the hash of the commit. From http://git-scm.com/book/en/Customizing-Git-Git-Attributes#Keyword-Expansion: "However, that result is of limited use. If you’ve used keyword substitution in CVS or Subversion, you can include a datestamp — the SHA isn’t all that helpful, because it’s fairly random and you can’t tell if one SHA is older or newer than another." `filter` takes work, but it can get the commit info into (and out of) a file. – Zach Young Oct 22 '14 at 18:57
13

This can be achieved by using the filter attribute in gitattributes. You'd need to provide a smudge command that inserts the commit id, and a clean command that removes it, such that the file it's inserted in wouldn't change just because of the commit id.

Thus, the commit id is never stored in the blob of the file; it's just expanded in your working copy. (Actually inserting the commit id into the blob would become an infinitely recursive task. ☺) Anyone who clones this tree would need to set up the attributes for herself.
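A hedged sketch of what such a smudge/clean pair could look like. The $CommitId$ keyword, the function names and the filter name commitid are all assumptions chosen for illustration; they are not built into Git.

```shell
#!/bin/sh
# Sketch of a smudge/clean pair; both read stdin and write stdout,
# which is how Git invokes filter commands.

# smudge: expand $CommitId$ (or an older expansion) to $CommitId: <hash>$.
# An optional first argument overrides the hash, which helps with testing.
smudge_commit_id() {
    hash=${1:-$(git rev-parse --verify HEAD 2>/dev/null)}
    hash=${hash:-unknown}
    sed 's/\$CommitId[^$]*\$/$CommitId: '"$hash"'$/g'
}

# clean: collapse any expansion back to the bare keyword, so the stored
# blob never changes just because the commit id did.
clean_commit_id() {
    sed 's/\$CommitId[^$]*\$/$CommitId$/g'
}
```

To wire these up, one would save each function as an executable script and register them with something like git config filter.commitid.smudge <path to smudge script> and git config filter.commitid.clean <path to clean script>, plus a .gitattributes line such as about.html filter=commitid. The smudge command runs while the checkout is still in progress, so it is worth verifying that git rev-parse HEAD reports the commit you expect at that point.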

legoscia
  • 9
    **Impossible** task, not recursive task. Commit hash depends on tree hash which depends on file hash, which depends on file contents. You have to get self-consistency. Unless you will find a kind of *[generalized] fixed point* for SHA-1 hash. – Jakub Narębski Aug 09 '10 at 19:53
  • 1
    @Jakub, is there some kind of trick in git that will allow to create tracked files which do not modify the resulting hash? Some way to override its hash, maybe. That'll be a solution :) – kolypto Dec 28 '10 at 16:25
  • 1
    @o_O Tync: Not possible. Changed file means changed hash (of a file) - this is by design, and by definition of a hash function. – Jakub Narębski Jan 16 '11 at 22:52
  • 2
    This is a pretty good solution, but bear in mind that this involves hooks which have to be manually installed whenever you clone a repository. – Edward Falk Jun 30 '16 at 20:49
12

Think outside of the commit box!

Pop this into the file .git/hooks/post-checkout (and make it executable):

#!/bin/sh
git describe --all --long > config/git-commit-version.txt

The version will be available everywhere you use it.
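Hooks are not copied by git clone, so each clone needs the hook installed once. A rough installer sketch, assuming .git is a plain directory (not the pointer file used by worktrees and submodules). The name install_post_checkout_hook is made up, and the mkdir -p line is an addition to the hook above so that the redirect does not fail when config/ is missing.

```shell
#!/bin/sh
# Sketch: install the post-checkout hook into a clone.
# install_post_checkout_hook is a hypothetical helper name.
install_post_checkout_hook() {
    repo=${1:-.}
    hook="$repo/.git/hooks/post-checkout"
    mkdir -p "$repo/.git/hooks" || return 1
    # Quoted heredoc delimiter: the hook body is written out verbatim.
    cat > "$hook" <<'EOF'
#!/bin/sh
# Record which commit was just checked out.
mkdir -p config
git describe --all --long > config/git-commit-version.txt
EOF
    chmod +x "$hook"
}
```

Running install_post_checkout_hook /path/to/clone once per clone is enough; the version file is then refreshed on every checkout.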

Keith Patrick
  • I slightly modified your answer to ensure that the version file is always included in the commit by adding this at the end: `git add config/git-commit-version.txt`. – Jason Wheeler Oct 08 '21 at 20:37
3

I don't think you actually want to do that, because when a file in the commit is changed, the hash of the commit is also changed.

midtiby
  • actually, if this git commit version is needed to show in the final build then what @Cascabel has suggested is the best option. `git describe --all --long > src/assets/git-commit-version.txt` this is what I'm doing in GitHub actions build steps. – Rajendra Jan 29 '22 at 14:30
2

Let me explore why this is a challenging problem using the git internals. You can get the sha1 of the current commit by

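#!/bin/bash
# Recompute the current commit's id by hand: SHA-1 over "commit <size>\0<body>".
commit=$(git cat-file commit HEAD)
# Note the space in "$( (": without it, "$((" can be read as arithmetic expansion.
sha1=($( (printf "commit %s\0" $(echo "$commit" | wc -c); echo "$commit") | sha1sum))
echo "${sha1[0]}"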

Essentially you run a sha1 checksum on the message returned by git cat-file commit HEAD. Two things immediately jump out as a problem when you examine this message. One is the tree sha1 and the second is the commit time.

Now the commit time is easily taken care of by altering the message and guessing how long it takes to make a commit or scheduling to commit at a specific time. The true issue is the tree sha1, which you can get from git ls-tree $(git write-tree) | git mktree. Essentially you are doing a sha1 checksum on the message from ls-tree, which is a list of all the files and their sha1 checksum.

Therefore your commit sha1 checksum depends on your tree sha1 checksum, which directly depends on the files' sha1 checksums, which in turn would have to depend on the commit sha1 written into the file, completing the circle. Thus you have a circular problem, at least with the techniques available to me.

With less secure checksums, it has been shown possible to write the checksum of the file into the file itself through brute force; however, I do not know of any work that accomplished that task with sha1. This is not impossible, but next to impossible with our current understanding (but who knows maybe in a couple years it will be trivial). However, still this is even harder to brute force since you have to write the (commit) checksum of a (tree) checksum of a (blob) checksum into the file.
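The first link of that dependency chain is easy to check by hand. A small sketch, assuming sha1sum from GNU coreutils is available; blob_sha1 is an illustrative name. In a SHA-1 repository it reproduces what git hash-object <file> prints: the SHA-1 of the header "blob <size>\0" followed by the file contents.

```shell
#!/bin/sh
# Sketch: compute a Git blob id without Git.
# blob_sha1 is a hypothetical helper name.
blob_sha1() {
    # $1: path to a file
    size=$(wc -c < "$1" | tr -d ' ')
    { printf 'blob %s\0' "$size"; cat "$1"; } | sha1sum | cut -d' ' -f1
}

# Example: a file containing "hello" plus a newline hashes to
# ce013625030ba8dba906f756967f9e9ca394464a; changing a single byte
# (say, to embed a commit id) yields a completely different blob id,
# and with it a different tree id and commit id.
```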

Novice C
  • Is there a way that one could commit the files, then do a checkout and have the latest commit hash placed as a comment at the beginning of each source code file? Then build and run from that? – John Wooten Feb 22 '19 at 22:32
0

I prefer simply writing the exact date-time and the parent commit's hash, so in .git/hooks/pre-commit I put the following code:

#!/bin/bash
ver_file=version.txt
> $ver_file
date +"%Y-%m-%d %T %:z" >> $ver_file
echo -n "Parent: " >> $ver_file
git rev-parse HEAD >> $ver_file
git add $ver_file
echo "Date-time and parent commit added to '$ver_file'"
exit 0

Sample auto-generated version.txt file:

2023-05-24 01:24:12 +03:30
Parent: 35acd10240a55d164b371aa28812e8e988ab0c8d

This method also keeps working when you check out another commit, since version.txt is stored exactly like any other tracked file. For the same reason it also works in remote repositories and submodules, but you have to make sure that the same hook exists as .git/hooks/pre-commit in each of those repositories.

Downsides

  1. It doesn't work for commits created on GitHub (and perhaps some other hosting sites), since local hooks such as pre-commit don't run there.

  2. There will always be a conflict on version.txt when performing a merge; however, nothing more than git commit -am <message> is needed to resolve it, since pre-commit also runs before the merge commit is created.