My situation is (roughly analogous to) the following:

We have the directory structure

my-repo/
  input.txt
  output-1.bin
  output-2.bin
  output-3.bin
  converter.py

For simplicity, let's say that converter.py looks like this:

#!/usr/bin/env python
from shutil import copyfile
copyfile('input.txt', 'output-1.bin')
copyfile('input.txt', 'output-2.bin')
copyfile('input.txt', 'output-3.bin')

We version-control both input.txt and output-*.bin. I know, I know, you're going to say that there's no reason to version-control the generated files... but this is non-negotiable in our case. We use the .bin files a lot, they're mission-critical, and we can't risk a subtle bug in converter.py screwing them up. Version-controlling both the converter.py script and the outputs makes sure that the ramifications of any script change are super obvious in the git history.

But this leads us to a problem. We can modify input.txt and commit that diff without ever running converter.py to update the .bins!

This is a perfect job for a git pre-commit hook.

We can get the list of changed files via git diff --cached --name-only --diff-filter=ACM. If that list includes input.txt or converter.py, then we want to run converter.py and then diff the output against the .bin files being committed.
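As a rough sketch of that first step (not a finished hook), the check could look like this in Python. The names `TRIGGERS`, `staged_paths` and `needs_check` are made up for illustration, and the sketch assumes the file names contain no characters that git would quote in its output.

```python
import subprocess

# Files whose change should trigger the regeneration check (names from above).
TRIGGERS = {"input.txt", "converter.py"}


def staged_paths():
    """Return the paths that are added, copied or modified in the index."""
    out = subprocess.check_output(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"]
    )
    return out.decode().splitlines()


def needs_check(paths, triggers=TRIGGERS):
    """True when at least one staged path is one of the trigger files."""
    return any(path in triggers for path in paths)
```

Inside the hook this would be called as `needs_check(staged_paths())`, and the hook would exit 0 right away when the result is false.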

So I have two problems/questions:

  • How can I run converter.py from within the pre-commit hook without clobbering whatever uncommitted changes the user might have in their local checkout? This is basically the existing question "How do I properly git stash/pop in pre-commit hooks to get a clean working tree for tests?"

  • How can I then, after running converter.py, ask git "Are there now any uncommitted diffs in the working tree?" I mean I hope the answer is simply git diff, but I'm unsure what exactly git diff means when executed from inside a pre-commit hook.

The reason this problem is non-trivial is that converter.py mutates the state of the working tree, instead of just spitting its output to stdout. Sadly this, too, is a non-negotiable axiom of the problem.

Thoughts? Working code snippets? :)

Quuxplusone
  • Anything that mutates the work tree is problematic because it precludes doing things in parallel. However, if the user is willing to sit there and wait through the pre-commit hook while getting nothing else done, see your linked question, and note that `git diff` (compare working tree against index) should indeed do the trick. There may be some issues with temporary index files (`git commit ` and `git commit -a`) here as well, though. – torek Oct 07 '15 at 04:27
  • @torek (1) let's assume `converter.py` is super fast; (2) I'm hoping to hear more about the "there may be some issues" part, because I don't believe I know enough of the corner cases to code the solution myself. – Quuxplusone Oct 07 '15 at 06:02
  • I'd have to test it out and/or peer at the git source and I don't have time for either, for the next few days at least. The first thing I'd check is whether `$GIT_INDEX_FILE` is set in the environment during a `git commit -a` or `git commit `, though. – torek Oct 07 '15 at 21:17

1 Answer

How about copying the script into a temp location, running it there, and comparing the results with the files in the repo?
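A minimal sketch of that idea, under some assumptions: it uses `git checkout-index --all --prefix=...` to export the staged content (rather than the work tree) into a scratch directory, and it assumes converter.py only needs the files that are in the index. The function names are hypothetical, and this has not been checked against the `git commit -a` corner cases mentioned in the comments.

```python
import glob
import os
import subprocess
import sys
import tempfile


def export_index(dest):
    """Write every file in the index (what is about to be committed) under dest."""
    # --prefix is pasted in front of each path, so it has to end with a slash.
    subprocess.check_call(
        ["git", "checkout-index", "--all", "--prefix=" + os.path.join(dest, "")]
    )


def stale_outputs(workdir, pattern="output-*.bin", script="converter.py"):
    """Run the script inside workdir and return the outputs whose bytes changed."""
    def snapshot():
        contents = {}
        for path in glob.glob(os.path.join(workdir, pattern)):
            with open(path, "rb") as handle:
                contents[os.path.basename(path)] = handle.read()
        return contents

    before = snapshot()  # the staged versions of the outputs
    subprocess.check_call([sys.executable, script], cwd=workdir)
    after = snapshot()   # the freshly regenerated versions
    return sorted(name for name in set(before) | set(after)
                  if before.get(name) != after.get(name))


def main():
    """Return 1 (reject the commit) when a staged output is out of date."""
    with tempfile.TemporaryDirectory() as scratch:
        export_index(scratch)
        stale = stale_outputs(scratch)
    if stale:
        print("Outputs out of date: " + ", ".join(stale), file=sys.stderr)
        return 1
    return 0
```

Saved as an executable `.git/hooks/pre-commit` with a Python 3 shebang, the file would end with `sys.exit(main())`. Because converter.py only ever runs in the scratch directory, the user's work tree is never touched, so no stash/pop is needed.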

Mykola Gurov
  • This doesn't suggest changing the work tree. Copy all the needed files to a temp dir (the whole project, if need be), build the output files there, and compare them with the committed ones. With all those constraints, no solution will be much easier. – Mykola Gurov Oct 07 '15 at 11:26