This is a bit of a mess.
Let's just take this part first, from your pre-commit hook:
git diff --name-only HEAD^ HEAD $(git write-tree)
The inner git write-tree writes the index to a tree object and prints that tree's hash. For the sake of example, let's say it's 01234567.
You then run git diff with three commit-or-tree-ish arguments (all three can be resolved to a tree identifier, which is what git cares about here):
HEAD^ HEAD 01234567
This invokes a little-known bit of behavior in git diff (undocumented in older releases of git): given three or more arguments, it produces a "combined diff". The first argument is treated as the child and all the remaining arguments as its parents, so this treats HEAD^ as the child commit, with the HEAD commit and the tree you just wrote as its two parents.
The git diff documentation notes that a combined diff "lists only files which were modified from all parents." In this case, again, the two "parents" are the proposed new commit's tree (from git write-tree) and the HEAD commit (the current tip of the current branch). Git shows a diff only for files where both of those differ from HEAD^ (the first parent of the tip of the current branch). This is not what you want! (Since you also specified --name-only, git shows just the file names, not the actual diff.)
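For a pre-commit hook, the files a proposed commit would change are the ones that differ between the index and HEAD, and an ordinary two-way diff reports exactly those. The following is a minimal sketch rather than a drop-in hook: it builds a throwaway repository, and the file names and identity in it are made up for illustration.

```shell
#!/bin/sh
# Sketch: list the files a proposed commit would change (index vs HEAD).
# The repository, file names and identity are illustrative assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo one > f1; echo three > f3
git add f1 f3
git commit -q -m initial
echo two > f2
git add f2
git commit -q -m 'add f2'

# Stage a change to f3 only; this is the "proposed commit".
echo changed > f3
git add f3

# Two-way diff between HEAD and the index.
staged=$(git diff --cached --name-only)
echo "$staged"

# The same comparison using an explicit tree from git write-tree.
tree=$(git write-tree)
from_tree=$(git diff --name-only HEAD "$tree")
echo "$from_tree"
```

Both forms compare exactly two trees, so no combined diff is involved and f2 (changed by HEAD itself, not by the proposed commit) is not listed.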
You then take those names and look in those files for git's "conflict markers" (the <<< and >>> marks around conflicting regions). That part is only slightly broken, but by this point the result is already wrong, because you may be looking at the wrong files.
Consider, for instance, the case where commit HEAD^ lacks file f2, commit HEAD adds file f2, and the current index modifies file f3 but leaves a git conflict in it:
$ mkdir /tmp/repo; cd /tmp/repo; git init
Initialized empty Git repository in /tmp/repo/.git/
$ echo ordinary file > f1; git add f1
$ echo another ordinary file > f3
$ git add f1 f3; git commit -m initial
[master (root-commit) f181096] initial
2 files changed, 2 insertions(+)
create mode 100644 f1
create mode 100644 f3
$ echo new file f2 > f2; git add f2
$ git commit -m 'add f2'
[master c06f8d1] add f2
1 file changed, 1 insertion(+)
create mode 100644 f2
$ (echo '<<< conflict'; echo '==='; echo '>>> end conflict') > f3
$ git add f3 # but we never resolved our (fake) conflict
$ git diff --name-only HEAD^ HEAD $(git write-tree)
f2
There's the problem: the combined diff did not look at f3, as it is not modified in both "parents" vs the "child" (of course these "parent"/"child" relationships are nonsensical anyway). Without --name-only we see the combined diff output:
$ git diff HEAD^ HEAD $(git write-tree)
diff --cc f2
index 9d57e62,9d57e62..0000000
deleted file mode 100644,100644
--- a/f2
+++ /dev/null
@@@ -1,1 -1,1 +1,0 @@@
--new file f2
If you want to check whether your proposed new commit's tree has some files with conflict markers, you need to examine the proposed "blobs" (the file contents stored in the index), rather than the current working tree. This is because you can git add a file and then modify it further, or use git add -p to interactively select parts to add and parts to defer, so the contents of the index may not match the working directory. There are a number of ways to do this; see this question and its answer for one method, and below (using git show with a revision-and-path) for another. The code you have now will work for some cases, but definitely not all.
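As a rough illustration of inspecting staged blobs rather than work-tree files, the sketch below reads each staged file with git show :path (a leading colon names the copy in the index). It assumes file names without whitespace and uses a made-up throwaway repository; the marker patterns are one possible choice, not a definitive test.

```shell
#!/bin/sh
# Sketch: scan staged blobs (not work-tree files) for conflict markers.
# Assumes path names contain no whitespace; the repository is illustrative.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo ordinary > f1
git add f1
git commit -q -m initial

# Stage a file with a (fake) conflict, then "fix" only the work tree.
printf '%s\n' '<<<<<<< ours' 'a' '=======' 'b' '>>>>>>> theirs' > f3
git add f3
echo resolved > f3

bad=""
# --diff-filter=ACM skips deleted files, which have no staged blob.
for path in $(git diff --cached --name-only --diff-filter=ACM); do
    # ":$path" is the version of the file stored in the index.
    if git show ":$path" | grep -q -e '^<<<<<<< ' -e '^>>>>>>> '; then
        bad="$bad $path"
    fi
done
echo "staged files with conflict markers:$bad"
```

Here the work-tree copy of f3 looks clean, but the staged copy (the one that would be committed) still contains the markers, which is why the check has to read from the index.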
With that out of the way, I see that Ikke has already answered the other issue, which is that a bare repository (the usual target for git push operations, and the place where you would run a pre-receive hook) has no work tree, so you can't look at files in that work tree. Pre-receive hooks are generally more difficult to write, as you must handle many cases:
- multiple commits
- references that are not branches (tags)
- objects that are not commits (annotated tags)
- branch creations and deletions as well as updates
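To make these cases concrete: a pre-receive hook reads one line per updated reference on standard input, in the form "old-id new-id refname", with an all-zeros ID standing for "does not exist". The sketch below sorts such lines into kinds and actions; the function name classify and the object IDs are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Sketch: classify pre-receive input ("old new refname").
# classify and the IDs below are hypothetical, for illustration only.
NULL_SHA1=0000000000000000000000000000000000000000

classify()
{
    # $1 = old id, $2 = new id, $3 = full reference name
    case "$3" in
    refs/heads/*) kind=branch ;;
    refs/tags/*)  kind=tag ;;
    *)            kind=other ;;
    esac
    if [ "$1" = "$NULL_SHA1" ]; then
        action=create
    elif [ "$2" = "$NULL_SHA1" ]; then
        action=delete
    else
        action=update
    fi
    echo "$kind $action"
}

A=1111111111111111111111111111111111111111
B=2222222222222222222222222222222222222222
classify "$NULL_SHA1" "$A" refs/heads/topic
classify "$A" "$B" refs/heads/master
classify "$A" "$NULL_SHA1" refs/tags/v1.0
```

Telling an annotated tag from a lightweight one would additionally need git cat-file -t on the new ID, which this sketch leaves out.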
When a branch (a reference of the form refs/heads/name) is proposed to be updated, the pre-receive hook gets its current SHA-1 and a proposed new SHA-1. You can then use git rev-list to find the sequence of objects that will be on (or no longer on) the branch if you allow the update. For each such object, if it's a commit, you would examine the tree attached to that commit, to see if all the blobs (files) in that tree pass inspection.
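The rev-list step can be tried out in isolation. This sketch assumes a throwaway repository with three made-up commits and treats the first one as the branch's old value; old..new lists what the update would add, and new..old lists what it would drop.

```shell
#!/bin/sh
# Sketch: enumerate the commits a branch update old -> new would bring in.
# The repository and its contents are illustrative assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo a > f1; git add f1; git commit -q -m one
old=$(git rev-parse HEAD)
echo b > f2; git add f2; git commit -q -m two
echo c > f3; git add f3; git commit -q -m three
new=$(git rev-parse HEAD)

# Commits that become reachable from the branch if the update is accepted.
added=$(git rev-list "$old..$new")
# Commits that would no longer be on the branch (empty for a fast-forward).
removed=$(git rev-list "$new..$old")

for rev in $added; do
    echo "commit $rev contains:"
    # Every blob path in the tree attached to this commit.
    git ls-tree -r --name-only "$rev"
done
```

Listing the whole tree of every commit is the thorough option; comparing each commit with its parent, as the outline further down does, inspects only what changed.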
Please note that pre-receive and update hooks are very different from other git "pre" hooks: in both cases, the proposed new commits and/or annotated tags are actually already in the repository (although they may be stripped out again if your hook rejects them), and you should generally refer to these proposed git objects by object ID (SHA-1). (It's OK to walk the commit graph; in fact, you must do this in many cases.) The point here is that what is correct for a pre-commit hook is almost guaranteed to be wrong for a pre-receive hook, and vice versa.
A rough outline of this process might be:
NULL_SHA1=0000000000000000000000000000000000000000

check_revs()
{
    local range branch rev rtype path
    range=$1
    branch=$2
    git rev-list "$range" |
    while read -r rev; do
        rtype=$(git cat-file -t "$rev")
        case $rtype in
        commit) ;;
        *) continue;; # skip annotated tags
        esac
        git diff --name-only "${rev}^" "$rev" |
        while IFS= read -r path; do
            if git show "${rev}:$path" | grep forbidden-item; then
                echo "error: branch ${branch}: ${rev}:$path contains forbidden-item" 1>&2
                exit 1 # leaves only this pipeline's subshell
            fi
        done || exit 1 # so pass the failure up one level
    done || return 1 # and report it to the caller
}

check_branch()
{
    local old new branch
    old=$1
    new=$2
    branch=$3
    if [ "$old" = "$NULL_SHA1" ] || [ "$new" = "$NULL_SHA1" ]; then
        # branch will be created or deleted, not updated
        # do whatever is appropriate here
        : # placeholder: a "then" branch may not be empty
    else
        # branch will be updated, if we allow it
        check_revs "$old..$new" "$branch"
    fi
}

while read -r oldsha newsha fullref; do
    case "$fullref" in
    refs/heads/*) check_branch "$oldsha" "$newsha" "${fullref#refs/heads/}" || exit 1;;
    # add cases for refs/tags/* if desired, etc
    *) ;;
    esac
done
exit 0 # if we got this far it must be OK to do it
(Note that this is entirely untested. I expect it has a bug in the "file deleted" case, where there is nothing to git show in the new revision. Also, it's not necessarily a good idea to check for <<< and the like. What happens if there's a text file illustrating how git conflicts look? You can choose the kind of inspection to do based on the file name, perhaps, but even then, some files might legitimately contain what look like, but are not actually, git conflict markers. If you choose to do this, make sure you allow some way around the check for cases where it should be allowed.)
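One possible way to sidestep the "file deleted" case is to ask git diff for only those paths that still exist in the new revision, using --diff-filter. The sketch below is an assumption about how that could look, not a tested fix for the outline above; the repository and file names are made up.

```shell
#!/bin/sh
# Sketch: skip deleted paths when listing what a commit changed.
# The repository and file names are illustrative assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo a > f1; echo b > f2
git add f1 f2
git commit -q -m initial

# One commit that deletes f1 and modifies f2.
git rm -q f1
echo changed > f2
git add f2
git commit -q -m 'delete f1, modify f2'
rev=$(git rev-parse HEAD)

# Unfiltered: includes f1, which git show "$rev:f1" cannot read.
all=$(git diff --name-only "${rev}^" "$rev")
# Added, Copied, Modified, Renamed only: every path exists in $rev.
present=$(git diff --name-only --diff-filter=ACMR "${rev}^" "$rev")

for path in $present; do
    git show "${rev}:$path" > /dev/null && echo "readable: $path"
done
```

The filtered list contains only paths that git show can read at that revision, so the inner loop never asks for a blob that is not there.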