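To get, programmatically, a count of the commits that differ between the current branch and its upstream, use git rev-list --count --left-right HEAD...@{upstream}, or git rev-list --count --left-right master...master@{upstream} for instance. Note the three dots here, which separate the branch name or HEAD from branch@{upstream}. The output is two numbers: the first counts commits reachable only from the left side (how far ahead you are), and the second counts commits reachable only from the right side (how far behind you are).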
This is how git status or git branch -vv prints ahead 1 or behind 2 or up to date or whatever.
Note that this assumes that you are on a branch in the first place, and that the branch has an upstream to be ahead of and/or behind. If the upstream is a remote-tracking name like origin/master, this also assumes that the value stored in the remote-tracking name is the one you want stored in it.
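If you want this in a script, a minimal sketch might look like the following. The function names format_ahead_behind and ahead_behind are made up for illustration, and the wording of the output only imitates what git branch -vv shows; it is not Git's own code.

```shell
# format_ahead_behind AHEAD BEHIND
# Pure helper (hypothetical name): turn the two counts into a short phrase.
format_ahead_behind() {
    ahead=$1 behind=$2
    if [ "$ahead" -eq 0 ] && [ "$behind" -eq 0 ]; then
        echo "up to date"
    elif [ "$behind" -eq 0 ]; then
        echo "ahead $ahead"
    elif [ "$ahead" -eq 0 ]; then
        echo "behind $behind"
    else
        echo "ahead $ahead, behind $behind"
    fi
}

# ahead_behind (hypothetical name): print the phrase for the current branch.
# Returns non-zero when there is no current branch or no upstream.
ahead_behind() {
    # Fails on a detached HEAD: no branch, hence no upstream.
    git symbolic-ref -q HEAD >/dev/null || return 1
    # Fails when the current branch has no upstream configured.
    git rev-parse -q --verify '@{upstream}' >/dev/null 2>&1 || return 1
    # Left count = commits only on HEAD; right count = commits only on upstream.
    counts=$(git rev-list --count --left-right 'HEAD...@{upstream}') || return 1
    # Split the two whitespace-separated numbers into $1 and $2.
    set -- $counts
    format_ahead_behind "$1" "$2"
}
```

Calling ahead_behind inside a repository prints one of the phrases, and a failing exit status tells the caller that the question does not apply (detached HEAD or no upstream).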
There is a lot more to know
If you are scripting this stuff, it's important to know (or define) precisely what you mean by up to date.
Purely locally—i.e., within one repository + work-tree combination—there are three entities to think about:
The current commit, aka HEAD.
This may be a detached HEAD, where HEAD contains a raw hash ID, or the opposite, on a branch, where HEAD contains the name of the branch itself. When on a branch, the branch name, e.g. master, contains the raw hash ID of the current commit. Either way, HEAD always refers to the current commit.[1]
The current commit itself is read-only (entirely) and permanent (mostly: you can deliberately abandon commits, after which they eventually get removed). You can change which commit is the current commit (e.g., git checkout different-commit), but you cannot change the commits themselves. Since the commit cannot change, it's never "out of date" by definition: it is whatever it is. Like any commit, the current commit has some metadata (who made it, when, etc.) along with a complete snapshot of every file.
Files stored inside commits are in a special, Git-only format (and of course are read-only).
The work-tree, which is simply where you do your work.
Here, you can read and write every file. These files are in their ordinary format, not compressed and Git-specific. You can also have files here that are not known to Git, but before we can talk about this properly, we need to cover the third entity.
The index, also called the staging area or sometimes the cache.
The index has several uses (hence the multiple names) but I think it is best described as the next commit you would make, if you made a commit right now. That is, the index (which is actually just a file) holds all the information Git needs to make a new snapshot, to put into a new commit. Hence the index holds all the files that will go into the next commit you make.
Files in the index are compressed, and in a Git-only format, just like files in commits. The crucial difference for our purposes here, though, is that the files in the index can be changed. You can put new files into the index, or remove existing files from the index, as well.
All that git add file really does is to copy a file from the work-tree into the index. This replaces the previous version in the index, so that the index now matches the work-tree. Or, if you wish to remove a file, git rm file removes that file from both the index and the work-tree.
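To see the "git add copies the file into the index" idea in action, here is a small sketch that compares the blob hash recorded in the index with the hash of the work-tree file. The helper names index_blob and index_matches_worktree are invented for this example, and it ignores the conflicted-merge case where the index holds several entries for one path.

```shell
# index_blob FILE (hypothetical name): print the blob hash the index holds
# for FILE. Prints nothing if FILE is not in the index.
index_blob() {
    # "git ls-files --stage" prints: <mode> <hash> <stage><TAB><path>
    git ls-files --stage -- "$1" | awk '{ print $2 }'
}

# index_matches_worktree FILE (hypothetical name): succeed only if FILE is in
# the index and the index copy has the same content hash as the work-tree copy.
index_matches_worktree() {
    in_index=$(index_blob "$1")
    [ -n "$in_index" ] || return 1
    # git hash-object computes the hash Git would store, without storing it.
    [ "$in_index" = "$(git hash-object -- "$1")" ]
}
```

After editing a tracked file, index_matches_worktree fails for it until you run git add on that file again.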
[1] A new repository has no commits at all, so there is an exception to this rule: HEAD can refer to a branch name that simply does not yet exist. That's the case in a brand new repository: HEAD says that the current branch is master, yet master does not actually exist until you make the first commit.
(The git checkout --orphan command can re-create this special "on a branch that does not exist yet" state for another branch. This is not something most people will do most of the time, but it can come up in programs that examine the state.)
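A program that examines the state has to tell these three cases apart: on a branch, detached, or on a branch that does not exist yet. One possible sketch follows; classify_head and head_state are hypothetical names, and the output strings are just examples.

```shell
# classify_head REF HASH (hypothetical name)
# REF is the output of "git symbolic-ref -q HEAD" (empty when detached).
# HASH is the output of "git rev-parse -q --verify HEAD" (empty when unborn).
classify_head() {
    ref=$1 hash=$2
    if [ -n "$ref" ] && [ -n "$hash" ]; then
        echo "on branch ${ref#refs/heads/} at $hash"
    elif [ -n "$ref" ]; then
        # HEAD names a branch, but that branch has no commit yet.
        echo "unborn branch ${ref#refs/heads/}"
    elif [ -n "$hash" ]; then
        echo "detached HEAD at $hash"
    else
        echo "no HEAD found"
        return 1
    fi
}

# head_state (hypothetical name): gather the two inputs from Git and classify.
head_state() {
    ref=$(git symbolic-ref -q HEAD 2>/dev/null) || ref=
    hash=$(git rev-parse -q --verify HEAD 2>/dev/null) || hash=
    classify_head "$ref" "$hash"
}
```

Keeping the classification separate from the Git calls makes the logic easy to check without a repository at hand.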
What git status does
Since the index and work-tree are both writable, both can be "dirty" or cause something to be "out of date" in some way. If you consider the work-tree file to be the newest, it may be the index copy that's out of date, because it does not match the work-tree copy. Once the work-tree file is copied into the index, the index no longer matches the HEAD commit, and a new commit will be needed at some point.
What git status does, besides running git rev-list --count --left-right with the branch and its upstream and getting those numbers,[2] is that it runs, in effect, two git diffs (with --name-status since it's not interested in a detailed patch):
Compare HEAD to index. Whatever is different here, these are the changes that are staged for commit, because if you made a commit now, Git would snapshot the entire index, and that snapshot would differ from the current commit in precisely these files.
Compare index to work-tree. Whatever is different here, these are the changes that are not staged for commit. Once you run git add on these files, the index copy will match the work-tree copy, but no longer match the HEAD copy, so now those will be changes that are staged for commit.
[2] Note that git status first checks that you're on a branch, and if so, that the branch has an upstream setting. Also, this is all built into it, so it does not have to run a separate program, but the principle is the same.
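The two comparisons can be approximated from a script with git diff itself. This is only a sketch of the idea, not what git status runs internally; the names staged_changes, unstaged_changes and paths_with_status are made up here.

```shell
# staged_changes (hypothetical name): HEAD vs index, one "STATUS<TAB>path" per line.
staged_changes() {
    git diff --cached --name-status
}

# unstaged_changes (hypothetical name): index vs work-tree, same line format.
unstaged_changes() {
    git diff --name-status
}

# paths_with_status LETTER (hypothetical name): read name-status lines on
# stdin and print the path of every line whose status starts with LETTER.
# For renames (R100<TAB>old<TAB>new) this prints the new path.
paths_with_status() {
    awk -F '\t' -v want="$1" 'substr($1, 1, 1) == want { print $NF }'
}
```

For example, staged_changes | paths_with_status M lists the files whose staged copy differs from the copy in the current commit.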
Untracked and maybe ignored
We can now properly define what it means for a file to be untracked, too. An untracked file is, quite simply, a file that is not in the index. That is, if we remove a file from the index (only) with git rm --cached, or if we create a file in the work-tree without creating a corresponding file in the index, we have a work-tree file that has nothing of the same name in the index. That's an untracked file.
If a file is untracked, git status normally whines about it: the diff it runs that compares the index to the work-tree says ah, here is a file in the work-tree that is not in the index, and Git would tell you that it is untracked. If it is untracked on purpose, you can have git status shut up about it, by listing that file (or a path-name pattern that matches it) in a .gitignore file. Essentially, just before complaining that some file is untracked, Git looks at the ignore directives.[3] But if the file is in the index, Git never looks for its name in any .gitignore.
[3] The ignore directives also tell git add that any en-masse "add everything" should avoid adding that file, if it's currently untracked.
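If a script needs to ask these questions directly, something along these lines should work. The function names are invented for the example; the git ls-files and git check-ignore options are standard ones.

```shell
# is_tracked PATH (hypothetical name): succeed if PATH is in the index.
is_tracked() {
    git ls-files --error-unmatch -- "$1" >/dev/null 2>&1
}

# is_ignored PATH (hypothetical name): succeed if an ignore directive matches
# PATH. By default check-ignore does not report paths that are in the index.
is_ignored() {
    git check-ignore -q -- "$1"
}

# untracked_files (hypothetical name): list work-tree files that are not in
# the index and not ignored, i.e. the ones git status would complain about.
untracked_files() {
    git ls-files --others --exclude-standard
}
```

An empty result from untracked_files means there is nothing untracked that the ignore directives have not already silenced.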
Upstreams and remotes
An upstream for a branch can be a remote-tracking name, like origin/master. These names are your Git's way of remembering some other Git's branches. To update the remote-tracking names for the remote origin, you simply run git fetch origin.
Note that you can have more than one remote! If you add a second remote fred at some second URL, git fetch fred will call up the Git at that URL, and update your fred/master and so on. So it's important to run git fetch to the right remote.
Running git fetch with no additional name will fetch from the remote for the current branch's upstream, or from origin if the current branch has no upstream or there is no current branch, so this is usually just a matter of running git fetch.
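If a script wants to know which remote a bare git fetch would contact, it can read the same configuration. The sketch below is an approximation of that rule, and default_fetch_remote is a hypothetical name; note that a branch whose upstream is another local branch has its remote set to a single dot, meaning "this repository".

```shell
# default_fetch_remote (hypothetical name): print the remote that a plain
# "git fetch" would be expected to use for the current branch.
default_fetch_remote() {
    # No current branch (detached HEAD): fall back to origin.
    branch=$(git symbolic-ref --short -q HEAD 2>/dev/null) || {
        echo origin
        return 0
    }
    # branch.<name>.remote is set when the branch has an upstream.
    remote=$(git config --get "branch.$branch.remote") || remote=origin
    echo "$remote"
}

# fetch_then_count (hypothetical name): refresh the remote-tracking names
# first, so that the ahead/behind numbers reflect the other Git's current state.
fetch_then_count() {
    git fetch "$(default_fetch_remote)" || return 1
    git rev-list --count --left-right 'HEAD...@{upstream}'
}
```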
Submodules
Submodules are really just references to another Git repository, but this throws a whole new wrinkle into the general plan. Each Git repository has its own HEAD, work-tree, and index. These can be clean or dirty as before, and if the submodule is not in detached-HEAD state, the submodule's branch can be ahead of and/or behind its upstream.
Submodule repositories are, however, normally in detached-HEAD state. Each commit in the superproject lists the specific commit to which your Git should detach that submodule Git. When the superproject Git checks out the commit, the superproject Git stores the hash ID for the submodule into the superproject's index. That way each new superproject commit records the correct hash ID.
To change the hash ID, git add in the superproject copies the current hash ID of the actual checked-out submodule into the index in the repository for the superproject (whew!). So if you've moved the submodule (via git checkout there), you navigate back to the superproject, run git add on the submodule path, and now the superproject's index records the correct hash ID, ready for the next superproject commit.
(Testing whether the submodule is on the commit desired by the superproject's index is more difficult.)
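One way to attempt that test is to read the hash ID recorded in the superproject's index and compare it with the submodule's own HEAD. This is a rough sketch with made-up function names, and it assumes the submodule is initialized and checked out; if the submodule directory is empty, git -C would silently find the superproject's repository instead.

```shell
# parse_gitlink (hypothetical name): read "git ls-files --stage" lines on
# stdin and print the hash of any entry with mode 160000 (a submodule link).
parse_gitlink() {
    awk '$1 == "160000" { print $2 }'
}

# recorded_submodule_commit PATH (hypothetical name): the hash ID that the
# superproject's index says the submodule at PATH should be on.
recorded_submodule_commit() {
    git ls-files --stage -- "$1" | parse_gitlink
}

# checked_out_submodule_commit PATH (hypothetical name): the hash ID the
# submodule's own HEAD currently refers to.
# Assumption: PATH holds an initialized submodule with its own repository.
checked_out_submodule_commit() {
    git -C "$1" rev-parse -q --verify HEAD
}

# submodule_matches_index PATH (hypothetical name): succeed if the two agree.
submodule_matches_index() {
    want=$(recorded_submodule_commit "$1")
    have=$(checked_out_submodule_commit "$1")
    [ -n "$want" ] && [ "$want" = "$have" ]
}
```

Run from the top level of the superproject, submodule_matches_index some/path fails once you have moved the submodule to another commit, and succeeds again after git add some/path.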