Earlier today I was in the process of splitting a large git commit into several smaller commits and so as usual I created a new branch, did a git reset HEAD^, and proceeded with git add -p to interactively select pieces of this large commit to split into the first of these smaller commits. I spent 20-30 minutes selecting the lines I wanted and then decided to commit the first patch. Out of habit, I reflexively typed git commit -a instead of just git commit and hit enter, and of course I lost all of my patch selections and I've ended up with exactly what I had before: one big commit.

Is there any way to revert to the previous version of my staged changes? What I think I want is to revert to the immediately prior version of my .git/index file, however from what I can tell staged changes are not versioned. So I think I'm screwed and I just have to start over.

I've been using Git daily for several years, and this has happened a few times in the past as well. Each time I realize what I've done, I facepalm and start over. I suspect this could be worked around with a pre-commit hook that makes a copy of .git/index (perhaps to .git/index.bak) to get one level of manual "undo", but I'm not sure that would be sufficient as I don't have a really deep understanding of Git internals. The best place I found for internals information is Chapter 10 of the Git book, although I've considered looking into the Git source code to get a better understanding of how the staging area works.

Thanks in advance for any ideas.

Michael Percy

2 Answers

TL;DR: you can use git fsck --lost-found

But it's going to be painful.

Edit: if your goal is to prevent yourself from doing this in the future, that's a bit tricky. You can save the original (pre-commit) index file somewhere; this index file has in it the hash IDs of the blob objects holding the carefully staged files, before the --all overwrote them. However, finding the path of this index file is tricky: your pre-commit hook is called with GIT_INDEX_FILE set to the path of the (now locked) "real" index containing the updated hashes, rather than the preserved-for-rollback index. You can use the horrible hack that it's probably just .git/index, but this is wrong in the presence of added work-trees (see git-worktree). See also builtin/commit.c near line 400.
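
As a rough illustration of the "horrible hack" above, a pre-commit hook could call something like the function below. This is only a sketch: the name backup_index and the index.bak file name are made up here, and it assumes the not-yet-overwritten index sits at <git-dir>/index while the hook runs, which is the assumption the paragraph above warns about.

```shell
# Hypothetical helper for a pre-commit hook: keep one level of "undo"
# for the staging area by copying the index aside.
backup_index() {
    local gitdir
    # --git-dir resolves to the per-work-tree git directory, so this does
    # not hard-code ".git". It deliberately ignores GIT_INDEX_FILE, which
    # during "git commit -a" points at the locked, already-updated index.
    gitdir=$(git rev-parse --git-dir) || return 1
    # A brand-new repository has no index file yet; nothing to save.
    [ -f "$gitdir/index" ] || return 0
    cp -p "$gitdir/index" "$gitdir/index.bak"
}
```

If the assumption holds, recovery after a mistaken commit would presumably be git reset --soft HEAD^ followed by copying index.bak back over index, but treat that as untested guidance rather than a guarantee.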

A safer trick might be to write your own git front end script or alias (see, e.g., this answer). If the command you're running is git commit and one of the arguments is -a, check whether the index matches HEAD (use git diff-index --cached; see require_clean_work_tree in git-sh-setup.sh). If it does not match, you may want to require some sort of "I really mean it" option.
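
A minimal sketch of such a front end, written as shell functions. The names commit_wants_all and safe_commit are invented for this example, and the option scan is deliberately simple: it recognizes -a, --all, and clusters that start with -a such as -am, and it can be fooled by unusual argument orders.

```shell
# Return 0 if the arguments look like they ask for "commit --all".
commit_wants_all() {
    local arg
    for arg in "$@"; do
        case $arg in
            --) return 1 ;;                  # only pathspecs follow
            -a|--all|-a[!-]*) return 0 ;;    # -a, --all, -am, -av, ...
        esac
    done
    return 1
}

# Run "git commit", but refuse -a/--all when something is already staged.
safe_commit() {
    # diff-index --cached --quiet exits non-zero when the index differs
    # from HEAD (or when HEAD cannot be read, which also refuses here).
    if commit_wants_all "$@" && ! git diff-index --cached --quiet HEAD --; then
        echo >&2 "safe_commit: the index has staged changes; refusing -a/--all."
        echo >&2 "Use plain 'git commit', or call git directly to override."
        return 1
    fi
    git commit "$@"
}
```

One way to use it would be to put these in a shell startup file and type safe_commit -a -m "message" instead of git commit -a -m "message"; calling git directly then serves as the "I really mean it" escape hatch.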

Long description

Unfortunately, when using --all or --include with pathspecs, Git simply overwrites the existing index contents with the additional or all files. (Well, if the commit fails, Git will restore the old index, but presumably the commit did not fail.) When using --only with pathspecs, the situation is more complicated, but that's not the case here.

What this means is that the interactively patched versions of the files have been lost.

On the bright side, lost here means lost, not destroyed: they are still in your repository, they just have no easy way to be found. Objects normally remain in the repository for at least 14 days after they were first written, to give the various parts of Git sufficient time to attach names to them by which they can be found. (Without this grace period, any random background operation that triggered an automatic git gc could destroy objects some other Git process is still working on.)

What this means is that you can run:

git fsck --lost-found

and Git will slowly and painfully traverse every object in the repository, not just the ones that are easily found, to find out whether the object can be found. If the object has a normal way to find it—e.g., is in a commit—then nothing special happens. If the object does not have such a way to find it, Git calls it a dangling object (more precisely, a dangling blob or dangling commit—there's a technical thingy here for commits that is special, that does not apply to blobs).

The word blob here essentially means file. With --lost-found, these "dangling blobs" are copied out to their expanded, original-text file representation, and stuffed into .git/lost-found/other/. The main problem here is that the name of the file is truly destroyed (it was only in the index), so these files are all named by their hash IDs now.

There will probably be many, many versions of each file here. You will have to examine them all and figure out that 9ab30ad... is slightly wrong-er than 57e4eea..., but oops, look at this, e83c1d7 is slightly better than either of those, and ... well, you get the idea. :-)
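
To make that examination a little less manual, something along these lines may help. It is a sketch, not a polished tool: the function names are made up, and it assumes you know which path you are trying to recover and that comparing against the version in HEAD is a useful starting point.

```shell
# Print the hash ID of every dangling blob that git fsck reports.
list_dangling_blobs() {
    git fsck --lost-found 2>/dev/null |
        awk '$1 == "dangling" && $2 == "blob" { print $3 }'
}

# Show how each dangling blob differs from the committed version of a path.
diff_dangling_against() {
    local path=$1 id
    list_dangling_blobs | while read -r id; do
        printf '=== %s\n' "$id"
        # git diff accepts two blobs; HEAD:path names the committed one.
        git diff "HEAD:$path" "$id"
    done
}
```

Running diff_dangling_against path/to/file then prints one diff per candidate, so blobs that belong to a different file show up as large, obviously unrelated diffs.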

If, by some chance, the version of file foo that was patched happened to exactly match any version of any file in any commit—for instance, maybe the correct version of x.doc was empty, and you have another empty file too that is committed—then there won't be a dangling blob for that file, because there is only one copy of any specific version of data. (For instance, the empty file has hash ID e69de29bb2d1d6434b8b29ae775ad8c2e48c5391. Every empty file has this same hash! The file hello\n has hash ce013625030ba8dba906f756967f9e9ca394464a. To see this, run echo hello | git hash-object -t blob --stdin.)
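
The content-addressing point can be checked directly. The helper names below are invented for illustration; the only Git command involved is git hash-object, which computes an ID without writing anything to the repository.

```shell
# Hash whatever arrives on stdin as a blob.
blob_id() {
    git hash-object -t blob --stdin
}

# Succeed if two files would be stored as the very same blob object.
same_blob() {
    [ "$(blob_id < "$1")" = "$(blob_id < "$2")" ]
}
```

For example, printf '' | blob_id gives the empty-file ID quoted above, and same_blob a.txt b.txt succeeds whenever the two files have identical contents, regardless of their names.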

torek
  • Unfortunately this didn't work. I found a bunch of commits in commits/ but in other/ I only got 20 files. I went through all of them and there were a bunch of changes in my previous "git add -p" that did not appear there. :( – Michael Percy May 21 '17 at 04:35
  • Hm, that's odd. The "matches version in some other commit" is a possibility. Otherwise, if it *was* staged, it should have become a dangling blob. – torek May 21 '17 at 07:08
  • Maybe that was the problem because I'm in the habit of making many small commits and then squashing / splitting to clean them up before pushing them upstream for review. Thanks so much for your amazing answer. Perhaps I should mark this answer as accepted and also edit my question noting that I wasn't able to recover my staged diffs using this technique. I like your script/alias idea, I'm going to give it a shot. – Michael Percy May 21 '17 at 09:08
I wrote a git commit hook (~/.git/hooks/pre-commit) that aborts git commit -a if there are staged changes (i.e. you used git add -p before). It is not 100% bulletproof, but works well for me:

#!/bin/bash

# `git commit -a` changes GIT_INDEX_FILE to .git/index.lock:
if GIT_INDEX_FILE=.git/index git diff --cached --diff-filter=ad --quiet ; then
    exit 0  # no modifications staged, no problem
fi

if git diff --diff-filter=ad --quiet ; then
    exit 0  # no modifications left over, no problem
fi

check_args () {
    while [ $# -gt 0 -a "$1" != commit ] ; do shift ; done ; shift
    while [ $# -gt 0 ] ; do
        case $1 in
            -a|--all)
                echo >&2 "Attention, staged changes and $1, aborting!"
                return 1
                ;;
            -m|--message|--author|--date|--cleanup|--pathspec-from-file|\
            -t|--template|-F|--file|--fixup|--squash|-c|--reedit-message|\
            -C|--reuse-message)
                # options with argument
                shift ; shift
                ;;
            --)
                shift ; break
                ;;
            -*)
                shift
                ;;
            *)
                break
                ;;
        esac
    done
    if [ $# -gt 0 ] ; then
        echo >&2 "Attention, staged changes and pathspec given, aborting!"
        return 1
    fi
}

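# Find the git process among this hook's ancestors (requires pstree).
pstree=$(pstree -spl $$) || exit 1
gitpid=$(echo "$pstree" | sed -n 's/^.*---git(\([0-9]\+\))---.*$/\1/p')
if [ -z "$gitpid" ] ; then
    echo >&2 "Could not find PID of git process, aborting!"
    echo >&2 "pstree: $pstree"
    exit 1
fi

# Read git's NUL-separated command line (requires /proc and bash >= 4.4).
mapfile -d '' cmdline < "/proc/$gitpid/cmdline"
check_args "${cmdline[@]}"
exit $?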

jmuc