
One of the things I keep in my open novel on GitHub is a list of words, and I would like to set its first line automatically to the number of words in the dictionary. My first option is to write a pre-commit hook that reads the file, counts the words, rewrites the first line, and writes the file back. Here's the code:

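PRE_COMMIT {
  my ($git) = @_;
  # Name of the branch that is currently checked out
  my $branch =  $git->command(qw/rev-parse --abbrev-ref HEAD/);
  say "Pre-commit hook in $branch";
  if ( $branch =~ /master/ ) {
     # Note: git show reports the files changed in HEAD, not the staged ones
     my $changed = $git->command(qw/show --name-status/);
     my @changed_files = ($changed =~ /\s\w\s+(\S+)/g);
     # $words holds the path of the dictionary file (defined elsewhere)
     if ( $words ~~ @changed_files ) {
       my @words_content = read_file( $words );
       # $#words_content is the last index: the line count minus the header line
       say "I have $#words_content words";
       $words_content[0] = "$#words_content\n";
       write_file( $words, @words_content );
     }
   }
};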

However, since the file has already been staged, I get this error:

error: Your local changes to the following files would be overwritten by checkout:
        text/words.dic
Please, commit your changes or stash them before you can switch branches.
Aborting

Might it be better to do it as a post-commit hook and have it changed for the next commit? Or do something completely different altogether? The general question is: if you want to process and change the contents of a file during commit, what's the proper way of doing it?

jjmerelo

2 Answers


The commit that git commit creates contains whatever is in the index once the pre-commit hook finishes. This means that you can change files in the pre-commit hook, as long as you git add them too.

Here's my example pre-commit hook, modified from the .sample:

#!/bin/sh
#
# An example hook script to verify what is about to be committed.
# [snipped much of what used to be in it, added this --
#  make sure you take out the exec of git diff-index!]

num=$(cat zorg)
num=$(expr 0$num + 1)
echo $num > zorg
git add zorg
echo "updated zorg to $num"
exit 0

and then:

$ git commit -m dink
updated zorg to 3
[master 76eeefc] dink
 1 file changed, 1 insertion(+), 1 deletion(-)

But note a minor flaw (won't apply to your case):

$ git commit
git commit
updated zorg to 4
# On branch master
# Untracked files:
[snip]
nothing added to commit but untracked files present (use "git add" to track)
$ git commit
updated zorg to 5
# Please enter the commit message for your changes. Lines starting
[snip - I quit editor without changing anything]
Aborting commit due to empty commit message.
$ git commit
updated zorg to 6
# Please enter the commit message for your changes. Lines starting

Basically, because the pre-commit hook updates the file and runs git add, the counter keeps incrementing even when no commit is actually made.

[Edit Aug 2021: I need to emphasize that I do not recommend this approach. Note that there are some oddball cases that can come up when using git commit -a, git commit --include, and git commit --only, including the implied --only that is inserted if you name files on the command line. This is due to the fact that this kind of git commit creates a second, and sometimes even a third, internal Git index. Any git add operations you do inside a hook can only affect one of these two or three index files.]
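Applied to the word list from the question, the same pattern (rewrite the file, then git add it) might look like the sketch below. It is only an illustration under assumptions: the path text/words.dic is taken from the error message in the question, the function name update_word_count is made up for this example, and the first line is assumed to hold the number of lines that follow it. The caveats about git commit -a, --include and --only still apply.

```shell
#!/bin/sh
# Hypothetical pre-commit hook (.git/hooks/pre-commit); a sketch, not a tested tool.
# Assumption: line 1 of the word list holds the number of lines that follow it.

# Rewrite the first line of file $1 so that it equals the number of remaining lines.
update_word_count() {
    file=$1
    # grep -c '' counts every line, including a last line without a newline
    count=$(tail -n +2 "$file" | grep -c '')
    tmp=$(mktemp) || return 1
    { echo "$count"; tail -n +2 "$file"; } > "$tmp" || return 1
    # Copy back instead of mv so the original file permissions are kept
    cat "$tmp" > "$file" && rm -f "$tmp"
}

words=text/words.dic   # assumed path, taken from the error message

# Only touch the word list when it is already staged for this commit.
if git diff --cached --name-only 2>/dev/null | grep -qxF "$words"; then
    # A failure here makes the hook exit non-zero, which aborts the commit
    update_word_count "$words" && git add "$words" &&
        echo "updated word count in $words"
fi
```

Because the hook only acts when the word list is already staged, it avoids bumping the file on commits that do not involve it, but git add still stages the whole working-tree version of that file, including any edits that were not staged before.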

torek
  • So basically it's a different kind of hack, right? Doing it my way the file is changed, but not committed until the next one – jjmerelo May 30 '13 at 09:46
  • Yes. I'm not sure I would be really happy with either approach; I'd rather have something like a Makefile to update things as needed, and something a bit more manual. But it should work. – torek May 30 '13 at 09:48
  • You could have your script do a `commit -a --amend` after your actual commit. – Yawar May 31 '13 at 01:49
  • This doesn't seem super-safe to me. It's possible and not uncommon for there to be changes in the working tree vs the index. If you use git add you'll be adding whatever state is in the working tree to the index, which would be undesirable. – tksfz Jun 06 '18 at 23:25
  • @tksfz: indeed, I don't *recommend* this at all. I just point out how it works. – torek Jun 07 '18 at 00:01
  • @tksfz one could do `git commit zorg --amend`, so only the file with the counter is added. – toolforger Aug 06 '21 at 07:39
  • @toolforger: you must be careful with `git commit --only` and `git commit --include` (`git commit zorg` means `git commit --only zorg` in this case, with neither flag given *explicitly*, but a file name given) as this changes the basic `git commit` action by creating two or even three separate index files. Any `git add` used during this special period, that lasts until the commit finishes or is aborted, affects only *one* of the multiple index files, with results that can be very confusing. – torek Aug 06 '21 at 12:23
  • @torek do you have any background where I can read up the details? We're having plans for automatic local pre-actual-commit changes, and we wouldn't want surprises here. – toolforger Aug 07 '21 at 18:26
  • @toolforger: I've written this up in detail in multiple SO answers: [here](https://stackoverflow.com/a/65647202/1256452), [here](https://stackoverflow.com/a/51587932/1256452), and [here](https://stackoverflow.com/a/56533336/1256452) for instance. This code inside `git commit` has evolved somewhat over the last decade, and the details are not documented and therefore subject to change, so don't depend too much on any one particular behavior. – torek Aug 07 '21 at 19:25
18

It turns out you can run "hooks" (they are actually handled by another mechanism) when staging files, at git add time:

https://git-scm.com/book/en/v2/Customizing-Git-Git-Attributes#_keyword_expansion

(scroll down a bit to the "smudge" and "clean" diagrams)

Here is what I understood:

  1. edit the .gitattributes, and create rules for the files which should trigger a dictionary update:

    novel.txt filter=updateDict

  2. Then, tell Git what the updateDict filter does on smudge (git checkout) and clean (git add):

    $ git config --global filter.updateDict.clean countWords.script

    $ git config --global filter.updateDict.smudge cat
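For illustration, here is a rough sketch of what countWords.script could contain. It is hypothetical: the answer does not show the script, and the function name fix_count_header is made up. It relies only on the fact that a clean filter receives the file content on stdin and must print the content to be staged on stdout, so it assumes the filter is attached to the word list itself (for example a .gitattributes line such as text/words.dic filter=updateDict) and that the first line holds the number of lines after it.

```shell
#!/bin/sh
# Hypothetical countWords.script: a clean filter reads the file content on
# stdin and writes the content that should be staged to stdout.

# Replace the first line of the input with the number of lines that follow it.
fix_count_header() {
    tmp=$(mktemp) || return 1
    cat > "$tmp"                      # buffer stdin so it can be read twice
    tail -n +2 "$tmp" | grep -c ''    # new first line: number of word lines
    tail -n +2 "$tmp"                 # the words themselves, unchanged
    rm -f "$tmp"
}

# Act as a filter only when invoked under the script's own (assumed) name,
# so the function can also be sourced and tried out separately.
case "$0" in
    *countWords.script) fix_count_header ;;
esac
```

With this arrangement only the staged copy gets the corrected count; the working-tree file keeps whatever first line it had, and the cat smudge filter passes content through unchanged on checkout.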

Gerold Broser
LeGEC