If I do
git diff --staged
it compares staged with HEAD.
If I do
git diff master...HEAD
it compares master with my branch (in the GitHub PR fashion)
How can I combine both so it compares master with staged?
There is no single Git command that compares the merge base (a third commit that you never name directly) to the staging area. If you want to diff the merge base of HEAD and the tip commit of branch B against the contents of the index / staging area, you must use two Git commands. As a two-line shell script, for instance, you can do:
base=$(git merge-base <B> HEAD)
git diff --cached $base # or --staged
You can make a Git alias that does this in one line, e.g.:
git config --global alias.bvs '!f() { git diff --staged $(git merge-base $1 HEAD); }; f'
(with the name bvs standing for Base Vs Staged) if you want to shorten how much you must type. (This alias is slightly flawed as it doesn't verify that you actually gave exactly one argument. It also does not test whether there is exactly one such merge base commit, but neither does git diff; see below.)
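The argument check that the alias skips can be added with a small shell function. This is only a sketch: the function name bvs mirrors the alias, while the usage message and the return codes are illustrative choices, not anything Git defines.

```shell
#!/bin/sh
# bvs: diff the merge base of <branch> and HEAD against the index.
# Hypothetical helper; name, message and return codes are illustrative.
bvs() {
    if [ "$#" -ne 1 ]; then
        echo "usage: bvs <branch>" >&2
        return 2
    fi
    # Without --all, git merge-base prints a single merge base.
    # It exits non-zero if the two commits share no history.
    base=$(git merge-base "$1" HEAD) || return 1
    git diff --cached "$base"
}
```

Once this is defined in your shell, running bvs master from a feature branch should show the staged changes relative to the merge base with master.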
Be careful with the dotted syntax: git diff treats it specially.
You can run:
git diff
git diff --staged # or --cached, exactly the same thing
git diff <commit>
git diff --cached <commit>
git diff <commit1> <commit2>
git diff <commit1>..<commit2> # note two dots
git diff <commit1>...<commit2> # note three dots
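(and more, but the "more" cases don't fall into this main pattern). In each of these various cases, you are selecting two things to compare:
with no arguments: the index to the work-tree;
with --staged alone: the HEAD commit to the index;
with one commit: that commit to the work-tree;
with --staged or --cached and one commit: that commit to the index;
with two commits, or with the two-dot syntax: the first commit to the second.
So far, all of these use the commit you specified, or both commits that you specified. But: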
git diff <commit1>...<commit2> # note three dots
actually compares some third commit to commit2.
The third commit that Git chooses is based on the result of running:
git merge-base --all <commit1> <commit2>
or (closer to what git diff actually does internally):
git rev-parse <commit1>...<commit2>
The git merge-base command finds all of the merge bases of the two specified commits, based on the commit graph. The commit graph is the result of interpreting all the commits' stored parents. The diagrams in the accepted answer to the question to which you linked in a comment help explain merge bases visually. (This link probably should have been in your original question,[1] as it helps define the issue.)
If you run the actual git rev-parse shown here, you'll see output like this:
$ git rev-parse master...origin/maint
da72936f544fec5a335e66432610e4cef4430991
083378cc35c4dbcc607e4cdd24a5fca440163d17
^da72936f544fec5a335e66432610e4cef4430991
This is rev-parse's textual representation for:
master: the tip of branch master;
origin/maint: the tip of some other Git's maint branch as of the last time I had my Git call up that other Git (origin/maint is my remote-tracking name for their maint branch); and
the merge base of those two commits, marked with a leading ^ to show that it is excluded.
A merge base is a commit that is reachable from both tip commits. It's more constrained than that: it's the best such commit.[2] There can be more than one "best" commit, and in this case, git merge-base --all will find all of them. The git rev-parse command will too: each one needs to be excluded with a prefix ^ character.
What git diff A...B, with three dots, does is to call the internal equivalent of git rev-parse to find all the merge bases. (You can do this yourself with git merge-base --all or by using git rev-parse, though you don't get the commits with all the internal flags that the internal interface uses.) Then, having noticed that there are some positively selected commits and at least one negatively selected commit, git diff picks one of the negatively selected commits, apparently at random, from the list. So if git rev-parse produces one merge base, git diff picks that one merge base. If git rev-parse lists two or more, git diff still just picks one, rather than producing a warning or error.
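To see the multiple-merge-base case for yourself, you can build a small criss-cross history in a throwaway repository. The sketch below assumes a POSIX shell and mktemp; the branch names x and y and the tags x1 and y1 are made up for the demonstration.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/x      # start on a branch named x
git config user.name demo
git config user.email demo@example.com

git commit -q --allow-empty -m root
git branch y                            # y starts at root as well
git commit -q --allow-empty -m X1       # x: root <- X1
git checkout -q y
git commit -q --allow-empty -m Y1       # y: root <- Y1
git tag x1 x                            # remember both pre-merge tips
git tag y1 y

git merge -q -m Y2 x1                   # y merges X1
git checkout -q x
git merge -q -m X2 y1                   # x merges Y1: criss-cross

all_bases=$(git merge-base --all x y)   # both X1 and Y1 qualify
one_base=$(git merge-base x y)          # only one of them is printed
echo "all merge bases:"
echo "$all_bases"
echo "single merge base: $one_base"

# Both tips, plus each merge base prefixed with ^
git rev-parse x...y
```

Neither X1 nor Y1 is an ancestor of the other, so both count as "best" common ancestors of the two merged tips.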
To do this manually, just run git merge-base without --all, which does essentially the same thing.[3] So that's what we should do.
Having found the merge base, git diff now compares the merge base commit to the right side commit hash (the second line of the output of git rev-parse). So that, too, is what we should do.
This gives rise to the two-step operation we wrote out, and then had the shell run as a single command line (which still runs two Git commands) in the alias.
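As a quick sanity check of that equivalence, the following sketch builds a throwaway repository with exactly one merge base and confirms that the three-dot form and the manual merge-base form print the same diff. The file and branch names are invented for the example.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name demo
git config user.email demo@example.com

echo base > file.txt
git add file.txt
git commit -q -m base

git checkout -q -b topic                # topic forks from "base"
echo topic > topic.txt
git add topic.txt
git commit -q -m topic

git checkout -q master                  # master moves on afterwards
echo more >> file.txt
git commit -q -am "master moves on"
git checkout -q topic

three_dot=$(git diff master...HEAD)
manual=$(git diff "$(git merge-base master HEAD)" HEAD)

if [ "$three_dot" = "$manual" ]; then
    echo "same diff"
fi
```

Both diffs contain only topic.txt; the line that master added later does not appear in either.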
[1] Note that in your question you said:
git diff master...HEAD
... compares master with my branch (in the GitHub PR fashion)
In fact, GitHub's comparison isn't quite like this and I have no idea how they produce exactly what they produce. But this—that is, the three-dot syntax—does use the merge base.
[2] The definition of best is a bit tricky. There are two good graph-theoretical definitions of the Lowest Common Ancestor in a graph, both from a paper in Journal of Algorithms by Michael A. Bender, Martín Farach-Colton, Giridhar Pemmasani, Steven Skiena, and Pavel Sumazin, titled "Lowest common ancestors in trees and directed acyclic graphs":
Definition 1. Let G = (V, E) be a DAG, and let x, y ∈ V. Let G_{x,y} be the subgraph of G induced by the set of all common ancestors of x and y. Define SLCA(x, y) to be the set of out-degree 0 nodes (leafs) in G_{x,y}. The lowest common ancestors of x and y are the elements of SLCA(x, y).
Definition 2. For any DAG G = (V, E), we define the partially ordered set S = (V, ≼) as follows: element i ≼ j if and only if i = j or (i, j) is in the transitive closure G_tr of G. Let SLAC(x, y) be the set of the maximum elements of the common ancestor set {z | z ≼ x ∧ z ≼ y} ⊆ V. The lowest common ancestors of x and y are the elements of SLAC(x, y).
[3] Note that these may pick different merge base commits! But you don't control which one git diff picks, so you can't really care too much if git merge-base picks a different one.
Try
git diff --cached master
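Note that git diff --staged master and git diff --cached master are the same command: both compare the tip of master (not the merge base) with your staged changes, from whatever branch you are on. Plain git diff master compares master with your work-tree instead, so it gives the same result only when you have no unstaged changes.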
If, on the contrary, you want a diff that does not take staged changes into account, use git diff master...HEAD, which compares commits only.
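The difference between diffing against the tip of master and diffing against the merge base only shows up once master has moved on. A minimal sketch in a throwaway repository (the file names and the topic branch are invented for the example):

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name demo
git config user.email demo@example.com

echo base > file.txt
git add file.txt
git commit -q -m base
echo more >> file.txt
git commit -q -am "master moves on"

git checkout -q -b topic HEAD~1         # topic forks from "base"
echo staged > staged.txt
git add staged.txt                      # staged, not committed

# Tip of master versus the index
tip_diff=$(git diff --cached master)
# Merge base of master and HEAD versus the index
base_diff=$(git diff --cached "$(git merge-base master HEAD)")

echo "--- against the tip of master ---"
echo "$tip_diff"
echo "--- against the merge base ---"
echo "$base_diff"
```

Here tip_diff also reports the line that master added as a removal, while base_diff contains only the staged file.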