Why does the --walk-reflogs (-g) option to git log disable --stat and --patch?

Question

From this answer, I saw I could get more power, compared to git stash list, by using git log -g stash (short for git log --walk-reflogs refs/stash). For instance, unlike git stash list, I can add options to narrow to stashes that affect a set of files or directories: git log -g stash -- Dir1 Dir2.

I discovered, though, that I could not get -p/--patch or --stat to work with --walk-reflogs, no matter how I ordered the parameters. The help for git log doesn't indicate any incompatibility of those options. Am I missing a way to get it to work, or is there some reason why --walk-reflogs would be incompatible with examining properties of a patch?

score 3 · Accepted Answer · answered Oct 25 '20 at 18:33

TL;DR

It's not that -g disables them, it's that git log doesn't do the right thing with stash commits in the first place.

Consider running your git log command with --format=%H to get raw hash IDs, then running a second set of Git commands (such as git stash show --stat) on each of the hash IDs found in the first step. Git is a set of tools, not a solution: each tool produces, or can produce anyway, output that serves as input to another tool.
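Here is a minimal sketch of that two-step approach, run in a throwaway repository. The file name a.txt, the stash message, and the temporary directory are my own illustrative assumptions, not anything from your setup:

```shell
set -e
# Throwaway repository so the sketch does not touch real work.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo one > a.txt
git add a.txt
git commit -q -m "initial"

# Create one stash that modifies a.txt.
echo two >> a.txt
git stash push -q -m "demo stash"

# Step 1: walk the stash reflog, printing only raw commit hash IDs.
hashes=$(git log -g --format=%H stash)

# Step 2: feed each hash ID to a second Git command.
stat_out=$(for h in $hashes; do git stash show --stat "$h"; done)
printf '%s\n' "$stat_out"
```

In a real repository you would add your pathspecs (-- Dir1 Dir2) to the first command and keep the second step unchanged.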

Or—and note that this is a special case here that makes use of something only lightly documented at best—use:

git log -m --first-parent -p -g stash

so that -m --first-parent makes the -p effective. This will work with --stat as well.

Long

You're being bitten here by the fact that the commits that git stash makes are, technically speaking, merge commits.

In a normal git log, when Git is walking the commit graph instead of the reflogs, Git is looking at the parent/child relationships of each commit. For instance, suppose we have:

...--F--G--H   <-- somebranch

and we run git log somebranch. Git starts out by finding the hash ID of commit H from the branch name, which gives it easy access to commit H itself. Git loads the metadata for H, which includes the hash ID of earlier commit G, and now has both hash IDs, G and H, in memory.

With -p or --stat, git log will run a git diff on these two commit hash IDs—G and H—internally and show either the resulting diff, or the stats from that diff. Then Git will move on to commit G and show it, loading the metadata of G which produces the hash ID of earlier commit F.

The diff or stat that you see at each point is a diff between the current commit, which is a child of a single parent, and its parent, with the parent commit as the left side of the diff and the current commit as the right side. Then git log goes on to display the parent commit. So the diffs kind of sit "between" the commit and its parent. This all makes pretty good sense and logic, too.
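To see this parent-to-child comparison for yourself, here is a small sketch in a scratch repository. The commit messages G and H mirror the drawing above; the file name b.txt is just an assumption for illustration. Both commands should report the same changed file:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Commit G: the parent.
echo one > b.txt
git add b.txt
git commit -q -m "G"

# Commit H: the child, which changes b.txt.
echo two >> b.txt
git commit -q -a -m "H"

# What git log shows at H ...
log_stat=$(git log -1 --stat --format=%s HEAD)
# ... versus an explicit diff with G on the left and H on the right.
diff_stat=$(git diff --stat HEAD~1 HEAD)

printf '%s\n---\n%s\n' "$log_stat" "$diff_stat"
```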

When walking reflogs, however, we might have something like this:

...--G--H   <-- branch@{0}
         \
          I--J--K   <-- branch@{1}

This would be the case after a git reset that discarded commits I-J-K, for instance. The git log command will show commit H as a diff from commit G, then show commit K as a diff from J. As long as you're prepared for this, and understand what's happening here, that's OK: that's what git log will actually do.

But note the phrase I used above: the current commit is a child of a single parent. When git log encounters a merge commit, it simply doesn't bother to do the diff at all, at least by default. That is, when the commit graph looks like this:

...--I--J
         \
          M--N   <-- branch
         /
...--K--L

and git log reaches commit M itself, it just doesn't bother to show -p or --stat output at all. It moves on from M to J, and at commit J, it does run a diff (against commit I). It also moves on from M to L—unless you've asked for --first-parent, that is—and at L, will show a diff (from K). But at M itself, it would have to do and/or show two diffs, and ... it just doesn't bother!
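A quick way to watch this happen, sketched in a scratch repository. The branch name side and the file names are assumptions made up for the example; the merge plays the role of M, and its first parent plays the role of J:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)   # whatever the default branch is called

# One commit on a side branch, one on the original branch, then merge.
git checkout -q -b side
echo s > side.txt
git add side.txt
git commit -q -m "side work"
git checkout -q "$main"
echo m > main.txt
git add main.txt
git commit -q -m "main work"
git merge -q --no-edit side

# The merge has two parent lines in its metadata.
parent_count=$(git cat-file -p HEAD | grep -c '^parent ')

# At the merge itself: --stat is requested, but no stat is printed.
merge_out=$(git log -1 --stat --format=%H HEAD)
# At its first parent, an ordinary commit: the stat appears.
parent_out=$(git log -1 --stat --format=%H HEAD^1)

printf 'parents: %s\n%s\n---\n%s\n' "$parent_count" "$merge_out" "$parent_out"
```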

You can force git log to bother with these diffs, but there are several caveats. Most importantly, these diffs are generally aimed at doing something useful with the merge commit, and as such they default to omitting a lot of information. The information they omit usually totally wrecks their usefulness with git stash, because while git stash makes commits that are, technically, merge commits, these merge commits don't have a form that would be useful as a merge: when you use git stash with these commits later, the stash code takes them apart, one commit at a time, instead of using them the way merges are normally used.

The forms of stashes

The commits that git stash makes take one of two forms. You either get this:

...--o--o--C   <-- branch (HEAD)
           |\
           I-W   <-- stash

or:

...--o--o--C   <-- branch (HEAD)
           |\
           I-W   <-- stash
            /
           U

When you run git stash save or git stash push, your current commit is commit C. Git finds it through the special name HEAD, which is attached to your branch name branch, which points to commit C. The stash command now builds the two (I-W) or three commits, and then updates refs/stash to point to new commit W. The two or three commits have these properties:

The I (index) commit holds whatever is in Git's index aka staging area at the time you run git stash.

If you had used git add or git add -p or some other method of updating Git's index so that the files in Git's index/staging-area did not match the files in commit C at the time you ran git stash, commit I will have some useful stuff in it. In fact, if you used git add on every file, commit I will exactly match commit W! Otherwise, if you used git add on no files, commit I will have, as its content, exactly the same set of files as commit C. Either way, due to Git's de-duplication, any shared content will literally be shared, and therefore take no extra disk space. But commit I will still exist, no matter what: that's how git stash knows that W is a stash commit.¹

The W (work-tree) commit holds whatever is in your working tree, as tracked files (files that are present in Git's index), at the time you run git stash.

This means commit W has what most people think of as the stash content: the modified files that they had not staged-and-committed yet. It actually has all the files, including ones that are not changed, just like any other commit. The stash ref (or, later, the reflog entry) points directly to commit W, so when you run git stash show stash@{2} or git stash apply for instance, that's how git stash finds commit W, from which it finds commit I and, if it exists, commit U as well.

The U commit, if present, holds any untracked files (perhaps including ignored files) that were present in your working tree at that same time. This commit only exists if you used -u or -a or their longer spellings. It does not hold the normal files (from commit C or your work-tree or anywhere), which makes it quite an odd commit. For this reason, it has no parent commit at all: it is a root commit, like the very first commit someone makes in a new, empty repository.

After git stash makes these two or three commits, it resets things (using git reset --hard). If you made a U commit, git stash removes from your working tree all of the files that are stored in the U commit, too. What's interesting to us at this point—where we're using git log—though, is the fact that W has the form of a merge commit, but not the standard content of a merge commit: it was not built by merging its parents, but rather by making a snapshot of your working tree files, whose names were listed in Git's index / are in the I commit.²
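You can inspect this layout directly. The sketch below makes a -u stash in a scratch repository and then picks the W, C, I and U commits apart using the parent suffixes ^1, ^2 and ^3. The file names tracked.txt and untracked.txt are assumptions for the example:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo one > tracked.txt
git add tracked.txt
git commit -q -m "C"

echo two >> tracked.txt        # unstaged change to a tracked file: goes into W
echo new > untracked.txt       # untracked file: goes into U because of -u
git stash push -q -u

w=$(git rev-parse stash)       # refs/stash points at W
c=$(git rev-parse 'stash^1')   # first parent: the commit that was HEAD
i=$(git rev-parse 'stash^2')   # second parent: the index commit
u=$(git rev-parse 'stash^3')   # third parent: present only with -u or -a

# Count parent lines in the raw commit objects.
w_parents=$(git cat-file -p "$w" | grep -c '^parent ')
u_parents=$(git cat-file -p "$u" | grep -c '^parent ' || true)

# U holds only the untracked files, none of the normal ones.
u_files=$(git ls-tree -r --name-only "$u")

printf 'W=%s parents=%s\nC=%s\nI=%s\nU=%s parents=%s files=%s\n' \
  "$w" "$w_parents" "$c" "$i" "$u" "$u_parents" "$u_files"
```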

¹This is a pretty poor test, since any merge commit at all will pass it, but it's better than nothing.

²Historically, there have been bugs here around the "intent to add" flag in the index. I have not checked to see if they are fixed for git stash in any particular version of Git, but I'd be wary of tripping over them: don't use git stash with the I-T-A stuff. Well, more generally I would say: Don't use git stash at all, except for very special short-term cases.

Forcing git log to diff merges

There are three options that make git log show a diff (or diff-stat) with merges:

-m tells git log to "split" a merge.

This option takes any merge commit and, for diff purposes, pretends that it was multiple separate commits, each with a single parent. Each of the virtual single-parent commits has the snapshot that the merge has, but has just one of the N parents of the merge. A standard two-parent merge thus produces two diffs, while a three-parent merge produces three diffs, and so on.

This, or running git diff manually yourself, is the only way to really see what happened in a merge. With real merges, you will often not care to see what really happened, as that's a lot of information that may be both overwhelming and irrelevant.

The -c option produces a combined diff.

The --cc option (two hyphens and two cs) produces a dense combined diff (occasionally called a "condensed combined diff", which makes the --cc spelling make sense, at least).

The combined diff options are tough to describe, but both have one key element that can make them useful with real merges, yet useless with stashes. Remember that a merge commit has two or more parents. A combined diff compares the merge commit's content against each parent's content. If the merge snapshot literally uses one of the parents' files, that file is omitted entirely from the combined diff output.

With a real merge, that means: the merge result just re-used one branch's file wholesale. Often, when inspecting merges, you don't care about this file: you only care about conflicts that had to be resolved, where the resulting file no longer matches either input parent. The -c and --cc options are designed to show you just these files.

But with a stash, the I commit often exactly matches either the C or W commit. If it does match the W commit, this will omit every file. If I matches C we're in better shape, but either way the -c and --cc options are going the wrong direction here.
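Here is a sketch of the worst case, where everything was staged before stashing so that I matches W. It uses a scratch repository and an assumed file name a.txt. The combined diff should come out empty, while splitting against the first parent still shows the change:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo one > a.txt
git add a.txt
git commit -q -m "C"

# Stage the whole change, so the index commit I matches the work-tree commit W.
echo two >> a.txt
git add a.txt
git stash push -q

# Combined diff: a.txt in W matches parent I, so the file is left out.
combined=$(git log -1 -c -p --format=%H stash)

# Split the merge, diffing only against the first parent (C).
split=$(git log -1 -m --first-parent -p --format=%H stash)

printf 'combined:\n%s\n---\nsplit:\n%s\n' "$combined" "$split"
```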

Last, there's a handy special-purpose option, --first-parent. The main function of this flag is to alter how Git walks through a merge, to follow only the first parent. However, there's a secondary feature. Note that the action of this option has very recently been updated slightly so that git log --first-parent -p is now equivalent to git log --first-parent -m -p,³ but regardless of your Git vintage, you can write git log --first-parent -mp to invoke the secondary feature. Here, the -m option "splits" the merge as usual, but the --first-parent combines with this action to diff only against the first parent (as well as walk only the first parent when using a regular graph walk).

³This feature is new in Git 2.29.0. To disable it, use git log --first-parent -p --no-diff-merges.
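The difference between -m alone and -m with --first-parent is easy to see on an ordinary two-parent merge. This is a sketch in a scratch repository; the branch name side and the file names are assumptions for the example:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)   # whatever the default branch is called

git checkout -q -b side
echo s > side.txt
git add side.txt
git commit -q -m "side work"
git checkout -q "$main"
echo m > main.txt
git add main.txt
git commit -q -m "main work"
git merge -q --no-edit side

# -m alone: one diff per parent, so both added files show up.
split=$(git log -1 -m --stat --format=%H HEAD)

# -m with --first-parent: only the diff against the first parent,
# which is the change the merge brought in from the side branch.
first=$(git log -1 -m --first-parent --stat --format=%H HEAD)

printf 'split:\n%s\n---\nfirst-parent:\n%s\n' "$split" "$first"
```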

Putting these together

What all this means in the end, for looking at stashes with git log -g, is that the -m option is necessary, so as to (a) enable the -p or --stat option and (b) make the resulting diff useful. The --first-parent option is advisable, because without it, even though Git is walking the reflogs rather than the commit graph, each stash would be shown as two (regular stash) or three (-u or -a stash) diffs.

If your Git version is 2.29 or later, you can use git log --first-parent -p -g stash: the -m is now implied. Or, regardless of Git vintage, you can use git log --first-parent -mpg, using the ability to combine single-letter flags.
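As a final check, this sketch compares the command from the question with the fixed one, in a scratch repository with a single stash. The file name a.txt is an assumption; I use --stat and --format=%H only to keep the output short:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo one > a.txt
git add a.txt
git commit -q -m "initial"

echo two >> a.txt
git stash push -q

# As in the question: the stat is requested, but the stash is a merge commit.
plain=$(git log -g --stat --format=%H stash)

# With -m and --first-parent: one stat per stash, against the commit it was made on.
fixed=$(git log -g -m --first-parent --stat --format=%H stash)

printf 'plain:\n%s\n---\nfixed:\n%s\n' "$plain" "$fixed"
```

Pathspecs such as -- Dir1 Dir2 can be appended to the fixed command in the same way as in the question.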