git diff between master and staged

Question

If I do

git diff --staged

it compares staged with HEAD.

If I do

git diff master...HEAD

it compares master with my branch (in the GitHub PR fashion)

How can I combine both so it compares master with staged?

score 2 · Accepted Answer · answered Jan 15 '20 at 18:07

TL;DR

A three-dot diff such as git diff master...HEAD compares HEAD against a commit you never named directly: the merge base. No single git diff invocation compares that merge base to the staging area. If you want to compare the merge base of HEAD and the tip commit of the branch named B against the contents of the index / staging-area, you must use two Git commands. As a two-line shell script, for instance, you can do:

base=$(git merge-base <B> HEAD)
git diff --cached $base            # or --staged

You can make a Git alias that does this in one line, e.g.:

git config --global alias.bvs '!f() { git diff --staged $(git merge-base "$1" HEAD); }; f'

(with the name bvs standing for Base Vs Staged) if you want to shorten how much you must type. (This alias is slightly flawed as it doesn't verify that you actually gave exactly one argument. It also does not test whether there is exactly one such merge base commit, but neither does git diff; see below.)
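If the missing argument check bothers you, a shell function is easier to harden than an alias. The following is only a sketch: the function name base_vs_staged is made up for illustration, and passing extra arguments through to git diff is my own addition, not something the alias above does.

```shell
# Hypothetical helper (the name is illustrative, not a Git command).
# Compares the merge base of <branch> and HEAD against the staging area.
base_vs_staged() {
    if [ "$#" -lt 1 ]; then
        echo "usage: base_vs_staged <branch> [git-diff options]" >&2
        return 2
    fi
    bvs_branch=$1
    shift
    # git merge-base exits non-zero when there is no common ancestor
    # or when the branch name cannot be resolved.
    bvs_base=$(git merge-base "$bvs_branch" HEAD) || return 1
    # Any remaining arguments are handed to git diff unchanged.
    git diff --staged "$@" "$bvs_base"
}
```

For example, base_vs_staged master --stat would summarize what is staged plus what is already committed on the current branch, relative to the merge base with master.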

Long-ish

Be careful with the dotted syntax: git diff treats it specially.

You can:

git diff
git diff --staged                # or --cached, exactly the same thing
git diff <commit>
git diff --cached <commit>
git diff <commit1> <commit2>
git diff <commit1>..<commit2>    # note two dots
git diff <commit1>...<commit2>   # note three dots

(and more!—but the "more" cases don't fall into this main pattern). In each of these various cases, you are selecting two things to compare:

commit1 vs commit2, for the cases that do use two commit specifiers but don't use three dots
commit vs work-tree, for the cases in which you name one commit and do not use --staged
commit vs index/staging-area, for the cases where you name one commit and use --staged or --cached
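To see these pairings side by side, here is a small sketch in a throwaway repository. The file name and its contents are invented for the demonstration; the only point is that HEAD, the index and the work tree each hold a different version of the same file.

```shell
# Throwaway repository; file.txt and its contents are illustrative only.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo one > file.txt
git add file.txt
git commit -q -m "first"          # HEAD holds "one"

echo two > file.txt
git add file.txt                  # index (staging area) holds "two"
echo three > file.txt             # work tree holds "three"

unstaged=$(git diff)              # index vs work tree: two -> three
staged=$(git diff --staged)       # HEAD vs index:      one -> two
vs_commit=$(git diff HEAD)        # HEAD vs work tree:  one -> three

printf '%s\n' "$unstaged" "$staged" "$vs_commit"
```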

So far, all of these use the commit you specified, or both commits that you specified. But:

git diff <commit1>...<commit2>   # note three dots

actually compares some third commit to commit2.

The third commit that Git chooses is based on the result of running:

git merge-base --all <commit1> <commit2>

or (closer to what git diff actually does internally):

git rev-parse <commit1>...<commit2>

The git merge-base command finds all of the merge bases of the two specified commits, based on the commit graph. The commit graph is the result of interpreting all the commits' stored parents. The diagrams in the accepted answer to the question to which you linked in a comment help explain merge bases visually. (This link probably should have been in your original question,¹ as it helps define the issue.)

If you run the actual git rev-parse shown here, you'll see output like this:

$ git rev-parse master...origin/maint
da72936f544fec5a335e66432610e4cef4430991
083378cc35c4dbcc607e4cdd24a5fca440163d17
^da72936f544fec5a335e66432610e4cef4430991

This is rev-parse's textual representation for:

the commit specified by the name origin/maint, the right side of the three dots (git rev-parse prints the right side first): the tip of some other Git's maint branch as of the last time I had my Git call up that other Git (origin/maint is my remote-tracking name for their maint branch);
the commit specified by the name master, the left side: the tip of branch master; and
the first commit that must be excluded in a revision walk that finds commits reachable from either of the two tip commits, but not reachable from both.

A merge base is a commit that is reachable from both tip commits. It's more constrained than that: it's the best such commit.² There can be more than one "best" commit, and in this case, git merge-base --all will find all of them. The git rev-parse command will too: each one needs to be excluded with a prefix ^ character.

What git diff A...B, with three dots, does is to call the internal equivalent of git rev-parse to find all the merge bases. (You can do this yourself with git merge-base --all or by using git rev-parse, though you don't get the commits with all the internal flags that the internal interface uses.) Then, having noticed that there are some positively selected commits and at least one negatively selected commit, git diff picks one of the negatively selected commits, at apparent-random, from the list. So if git rev-parse produces one merge base, git diff picks that one merge base. If git rev-parse lists two or more, git diff still just picks one, rather than producing a warning or error.
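The multiple-merge-base case is easy to reproduce with a criss-cross merge. This is a sketch in a throwaway repository; the branch name other, the file names and the commit messages are all made up for the demonstration.

```shell
# Throwaway repository with a criss-cross merge; all names are illustrative.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # fix the branch name regardless of init.defaultBranch
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo a > a.txt; git add a.txt; git commit -q -m A
git branch other
echo b > b.txt; git add b.txt; git commit -q -m B       # master: A-B
git checkout -q other
echo c > c.txt; git add c.txt; git commit -q -m C       # other:  A-C
git checkout -q master
git merge -q -m "merge C into master" other             # parents: B, C
git checkout -q other
git merge -q -m "merge B into other" master~1           # parents: C, B

# Both B and C are now "best" common ancestors of the two tips.
all_bases=$(git merge-base --all master other)
one_base=$(git merge-base master other)                 # picks just one of them
excluded=$(git rev-parse master...other | grep -c '^\^')

echo "merge bases:        $(printf '%s\n' "$all_bases" | grep -c .)"
echo "excluded (^) lines: $excluded"
echo "merge-base without --all chose: $one_base"
```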

To do this manually, just run git merge-base without --all, which does essentially the same thing.³ So that's what we should do.

Having found the merge base, git diff now compares the merge base commit to the right side commit hash (the first line of the output of git rev-parse, which prints the right side before the left side). So that, too, is what we should do.

This gives rise to the two step operation we wrote out, and then had the shell do in one command-line command (that runs two Git commands) in the alias.
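As a sanity check, the sketch below builds a throwaway repository and confirms that the three-dot diff matches the manual merge-base diff, then runs the staged variant. The branch name topic and the file names are invented for the demonstration.

```shell
# Throwaway repository; branch and file names are illustrative only.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # fix the branch name regardless of init.defaultBranch
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo base > base.txt;     git add base.txt;   git commit -q -m "base"
git checkout -q -b topic
echo topic > topic.txt;   git add topic.txt;  git commit -q -m "topic work"
git checkout -q master
echo newer > master.txt;  git add master.txt; git commit -q -m "master moves on"
git checkout -q topic
echo staged > staged.txt; git add staged.txt  # staged, not committed

base=$(git merge-base master HEAD)

three_dot=$(git diff --name-only master...HEAD)           # merge base vs HEAD
manual=$(git diff --name-only "$base" HEAD)               # same comparison, spelled out
staged_vs_base=$(git diff --staged --name-only "$base")   # merge base vs index
staged_vs_master=$(git diff --staged --name-only master)  # master tip vs index

printf 'three-dot:        %s\n' "$three_dot"
printf 'manual:           %s\n' "$manual"
printf 'staged vs base:   %s\n' $staged_vs_base
printf 'staged vs master: %s\n' $staged_vs_master
```

The last comparison also lists master.txt, because it diffs against the tip of master rather than the merge base, so work done on master after the branch point shows up as a difference.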

¹Note that in your question you said:

git diff master...HEAD ... compares master with my branch (in the GitHub PR fashion)

In fact, GitHub's comparison isn't quite like this and I have no idea how they produce exactly what they produce. But this—that is, the three-dot syntax—does use the merge base.

²The definition of best is a bit tricky. There are two good graph theoretical definitions of the Lowest Common Ancestor in a graph, both from a paper in Journal of Algorithms by Michael A. Bender, Martín Farach-Colton, Giridhar Pemmasani, Steven Skiena, and Pavel Sumazin, titled Lowest common ancestors in trees and directed acyclic graphs:

Definition 1. Let G = (V, E) be a DAG, and let x, y ∈ V. Let G_x,y be the subgraph of G induced by the set of all common ancestors of x and y. Define SLCA(x, y) to be the set of out-degree 0 nodes (leafs) in G_x,y. The lowest common ancestors of x and y are the elements of SLCA(x, y).

Definition 2. For any DAG G = (V, E), we define the partially ordered set S = (V, ≼) as follows: element i ≼ j if and only if i = j or (i,j) is in the transitive closure G_tr of G. Let SLCA(x, y) be the set of the maximum elements of the common ancestor set {z | z ≼ x ∧ z ≼ y} ⊆ V. The lowest common ancestors of x and y are the elements of SLCA(x, y).

³Note that these may pick different merge base commits! But you don't control which one git diff picks, so you can't really care too much if git merge-base picks a different one.

score 1 · Answer 2 · answered Jan 15 '20 at 16:20

Try

git diff --cached master

See https://git-scm.com/docs/git-diff#Documentation/git-diff.txt-emgitdiffemltoptionsgt--cachedltcommitgt--ltpathgt82308203

phd

This compares what is --cached in my branch with master, but it is not what I need. I need to compare staged with master BUT in the same way as in git diff master...HEAD see https://imgur.com/EIRoW1j or https://stackoverflow.com/questions/7251477/what-are-the-differences-between-double-dot-and-triple-dot-in-git-dif – user33276346 Jan 15 '20 at 17:22
["git diff A...B" is equivalent to "git diff $(git merge-base A B) B"](https://git-scm.com/docs/git-diff#Documentation/git-diff.txt-emgitdiffemltoptionsgtltcommitgtltcommitgt--ltpathgt82308203). Would it satisfy you to run `git diff --cached $(git merge-base master HEAD)`? – phd Jan 15 '20 at 17:45

score 0 · Answer 3 · answered Jan 15 '20 at 16:31

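git diff --staged master and git diff --cached master give the same result when run from a branch other than master: both compare master against your staged changes. git diff master also takes your staged changes into account, but it compares master against the work tree, so it includes any unstaged changes as well.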

If, on the contrary, you wanted to diff without taking staged changes into account, then you'd have to use git diff HEAD...master
