If I could use a URL rather than a name that would be even better, but as far as I'm aware git is only able to describe remotes after calling git remote update.
That's correct, if a bit imprecise. What's happening "under the hood" is this: git remote update more or less just runs git fetch against various remotes, and git fetch can do the same on its own, so use whichever command you prefer. Once the fetch is done, your Git has, in your repository, every commit that their Git has, and your Git has updated your remote-tracking branch names such as origin/master and bean/master, and likewise for whatever additional remotes you have.
Hence, now that you have what they have and more (if there is anything more anyway), you can figure out whether any given branch name that you have points to:
- the same commit as some remote-tracking branch name;
- an earlier commit on the same chain of commits;
- a later commit on the same chain of commits; or
- a completely unrelated commit.
(The last case is unlikely, but should be mentioned for completeness.)
Chains of commits: the commit graph
These chains of commits are those formed by the commit graph or DAG:
A <-B <-C <-- master
represents a very simple repository with just three commits. We say that the name master "points to" commit C, because master contains the big ugly hash ID that is the "true name" of commit C. Meanwhile commit C contains the hash ID for commit B, so C points to B; and similarly, B points to A. B is C's parent, and A is B's parent.
Since A was the very first commit, there is no parent commit ID it can have, so it has none. This makes it a root commit: it points nowhere, which is what tells git log it can stop. We (or git log) start with the name master and view commit C, then follow C's arrow back to B and view B, then follow B's backwards arrow to A and view A. At that point there is nothing left to follow, and the walk is done.
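To make that walk concrete, here is a minimal Python sketch of the same idea. It is a toy model, not how Git is implemented: the single-letter commit names, the parents table, and the walk function are all made up for illustration, and it assumes every commit has at most one parent.

```python
# Toy model of the three-commit repository. Each commit maps to its parent;
# None marks the root commit. Real Git uses hash IDs, not letters.
parents = {"A": None, "B": "A", "C": "B"}

# Branch names simply hold the ID of one commit (the tip).
branches = {"master": "C"}


def walk(branch):
    """Return the commits visited from the branch tip back to the root."""
    visited = []
    commit = branches[branch]
    while commit is not None:
        visited.append(commit)
        commit = parents[commit]  # follow the backwards arrow
    return visited


print(walk("master"))  # ['C', 'B', 'A']
```

The loop ends when it reaches A because A has no parent, which is the same reason the real walk stops at a root commit.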
When you git fetch a new commit D that has C as its parent, we get the slightly more complicated picture:
A--B--C   <-- master
       \
        D   <-- bean/master
It's easy to see from this drawing that bean/master is "one commit ahead" of master. But internally, all Git arrows point backwards, so Git has to start from bean/master and work back. When it reaches commit C, which is the commit that master points to, it has passed exactly one commit (D), and that's how we know that master is one behind bean/master.
As AnimiVulpis answered (upvoted), you can get git rev-list to count commits for you, using --count. Normally it just lists the commit hashes. It's just like git log: it starts at the commit you tell it to start at, and follows the internal "arrows" backwards from one commit to another. If you give it a stopping point, it stops when it reaches a commit that either is the stopping point or (this part is a bit tricky) is reachable from the stopping point.
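As a rough illustration of "count what is reachable from the start but not from the stopping point", here is a small Python sketch using the A-B-C-D graph above. The parents table and the reachable and rev_list_count helpers are hypothetical names for this example only; real Git does the walk far more cleverly, but the result is meant to match what git rev-list --count start ^stop reports.

```python
# Toy commit graph: each commit lists its parents (a root has none).
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"]}
refs = {"master": "C", "bean/master": "D"}


def reachable(tip):
    """Return the set of commits reachable from tip, including tip itself."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def rev_list_count(start, stop):
    """Toy model of: git rev-list --count start ^stop"""
    return len(reachable(refs[start]) - reachable(refs[stop]))


print(rev_list_count("bean/master", "master"))  # 1 (commit D)
print(rev_list_count("master", "bean/master"))  # 0
```

The second count is zero because everything reachable from master is also reachable from bean/master in this graph.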
Let's draw a slightly more complicated picture, where you've made one new commit on your master (we'll call this E) and brought in D from bean to make bean/master point to it:
        E   <-- master
       /
A--B--C
       \
        D   <-- bean/master
Now master is one commit ahead of bean/master, and bean/master is one commit ahead of master, at the same time. This is because if we start from master and work backwards, we find one commit that we cannot reach by starting at bean/master and working backwards. The same is true if we start the other way around.
Hence, we need two git rev-list commands. We would run one with master ^bean/master, aka bean/master..master: start with master, and stop when reaching commit C because it's reachable from bean/master. That counts commit E on master, and stops. For the other, we would use bean/master ^master, aka master..bean/master: start with bean/master, and stop when reaching commit C because it's reachable from master. That counts commit D on bean/master, and stops.
This reachability concept is one of the key graph-theory bits that makes Git work. A good way to visualize it is to imagine coloring each commit temporarily, as with a highlighter pen: we color some commits red as "stop" and others green as "go", and red tends to override green, if we're doing both colors "at the same time". The X ^Y notation means use green starting from X and red starting from Y. The Y..X notation is just shorthand for the same thing.
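The highlighter picture can be sketched in a few lines of Python. This is only a toy model of the coloring idea on the A-B-C/D/E graph above; the select function and the colors dictionary are invented for this example, and painting all the red first is just the simplest way to make red override green.

```python
# Toy commit graph: E is on master, D is on bean/master, both on top of C.
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"], "E": ["C"]}
refs = {"master": "E", "bean/master": "D"}


def select(include, exclude):
    """Toy model of 'include ^exclude': green from include, red from exclude."""
    colors = {}

    # Red pass: everything reachable from the excluded name is "stop".
    stack = [refs[exclude]]
    while stack:
        commit = stack.pop()
        if commit not in colors:
            colors[commit] = "red"
            stack.extend(parents[commit])

    # Green pass: walk from the included name, but never repaint a red
    # commit and never walk past one (its ancestors are already red).
    stack = [refs[include]]
    while stack:
        commit = stack.pop()
        if commit not in colors:
            colors[commit] = "green"
            stack.extend(parents[commit])

    return sorted(c for c, color in colors.items() if color == "green")


print(select("master", "bean/master"))  # ['E']
print(select("bean/master", "master"))  # ['D']
```

Each call selects exactly one commit here, which is why both names are "one ahead" of each other.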
Symmetric difference makes this a bit easier
As it turns out, Git has a special notation, X...Y (three dots instead of two), that denotes a symmetric difference: color commits green if reachable from only one of the starting points, but red if reachable from both. In this graph, bean/master...master would select commit E (reachable from master but not bean/master) and commit D (reachable from bean/master but not master), but would reject commits C and earlier.
That doesn't necessarily seem all that useful here, until you find that git rev-list has a --left-right option as well. When using this option with the symmetric difference three-dot syntax, Git will note which commits came from the "left name" (bean/master) and which came from the "right name" (master). Normally, when git rev-list is spitting out commit hash IDs, it uses < and > to mark these. But if you add --count, Git just counts them as usual, then prints two numbers:
git rev-list --left-right --count bean/master...master
The number on the left is the number of commits reachable from bean/master but not from master, and the number on the right is the number of commits reachable from master but not from bean/master.
And (aha!) these are exactly the counts that git status prints for "behind" and "ahead", at least when bean/master is the upstream of master. (Swap the names to get them in the other order, if you prefer.)
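Here is a hedged Python sketch of what the two numbers from --left-right --count on a three-dot range represent. It is a toy model, not Git's implementation. To make the two sides easy to tell apart, it assumes a hypothetical extra commit F on top of E that is not in the drawings above; the helper names are likewise made up.

```python
# Toy commit graph. F is a hypothetical extra commit on master (not in the
# drawings above), added so that the two counts differ.
parents = {
    "A": [], "B": ["A"], "C": ["B"],
    "D": ["C"],            # bean/master side
    "E": ["C"], "F": ["E"],  # master side
}
refs = {"master": "F", "bean/master": "D"}


def reachable(tip):
    """Return the set of commits reachable from tip, including tip itself."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def left_right_count(left, right):
    """Toy model of the two counts for the symmetric difference left...right."""
    left_set = reachable(refs[left])
    right_set = reachable(refs[right])
    return len(left_set - right_set), len(right_set - left_set)


behind, ahead = left_right_count("bean/master", "master")
print(behind, ahead)  # 1 2
```

With bean/master on the left, the first number plays the role of "behind" and the second the role of "ahead" for master.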
The caveat: unrelated branches
You can, if you use unrelated repositories or git checkout --orphan, create a repository with disjoint subgraphs within it:
A--B--C <-- master
D--E <-- unrelated/master
The symmetric difference notation will, in this case, list or count all the commits on both branches, since all it does is list or count commits reachable from either name but not from both. Since the parent chains never come together, no commit is reachable from both.
You can detect this situation if you really need to (when it occurs there is no merge base between the two names), but it generally should not happen in the first place. Note that having multiple root commits does not, by itself, guarantee that two branches are unrelated, since we can deliberately merge unrelated histories:
A--B
    \
     E--F   <-- branch
    /
C--D
and we can even have a fork after the merge:
A--B      G   <-- br1
    \    /
     E--F
    /    \
C--D      H--I   <-- br2
but these branches do have a merge base commit (it's commit F, which is obvious from the graph).
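For the curious, here is a minimal Python sketch of the "is there a merge base?" check on toy versions of the two graphs in this section. It is an illustration only: the merge_bases helper is a made-up name, and it uses the simple definition "a common ancestor that is not an ancestor of any other common ancestor". In a real repository you would ask Git itself, for instance with git merge-base.

```python
def reachable(parents, tip):
    """Return the set of commits reachable from tip, including tip itself."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def merge_bases(parents, a, b):
    """Return the 'best' common ancestors of a and b (empty if unrelated)."""
    common = reachable(parents, a) & reachable(parents, b)
    # Keep only common ancestors that no other common ancestor can reach.
    return sorted(
        c for c in common
        if not any(c != other and c in reachable(parents, other)
                   for other in common)
    )


# Two roots (A and C), merged at E, then forked after F.
forked = {
    "A": [], "B": ["A"], "C": [], "D": ["C"],
    "E": ["B", "D"], "F": ["E"],
    "G": ["F"],              # br1
    "H": ["F"], "I": ["H"],  # br2
}

# Two disjoint chains: A--B--C (master) and D--E (unrelated/master).
disjoint = {"A": [], "B": ["A"], "C": ["B"], "D": [], "E": ["D"]}

print(merge_bases(forked, "G", "I"))    # ['F']
print(merge_bases(disjoint, "C", "E"))  # []
```

An empty result is the signal that the two names share no history, in which case the symmetric-difference counts cover every commit on both sides.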