How do you programmatically check if the local copy is behind the remote?

Question

Currently I'm fetching the latest changes, then running `git status` and parsing the output for `Your branch is up to date with 'origin/master'`, but that feels like a hack.

I've looked into using `git status --porcelain`, but that only includes file changes made on the local system, not changes made on the remote. I don't care what changes have actually been made; I just want to know whether any changes exist at all (either locally or on the remote).

How would I achieve this cleanly?

`git fetch; git diff ..origin/master` Does this not do the trick? This lists the changes on the remote and your local commits. If you don't get anything, you should be up to date with `origin/master`. — Saugat, Aug 14 '18 at 11:45
@OliverRadini Not necessarily. I just don't want to have to rely on human-readable output, since its syntax can change without warning with an update to Git. — Olian04, Aug 14 '18 at 11:48
So then, you'd like to be able to write a program that will be able to parse the status of differences as a boolean or something? — OliverRadini, Aug 14 '18 at 11:49
@OliverRadini I just need a set of git commands to get the job done. Writing a program to call them is not a problem. — Olian04, Aug 14 '18 at 11:51
I suppose my point is that if your program just runs git commands, it'll need to parse the output of status anyway, so it would be relying on a human-readable form. If you'd like to write a program to do it, then there may be APIs, but that's a little language-specific, so perhaps it should be included in the question. — OliverRadini, Aug 14 '18 at 11:53
@SaugatAcharya Now I feel silly, that works like a charm. Could you explain the `..`? — Olian04, Aug 14 '18 at 11:53
Possible duplicate of [Git: How to check if a local repo is up to date?](https://stackoverflow.com/questions/7938723/git-how-to-check-if-a-local-repo-is-up-to-date) — Liam, Aug 14 '18 at 11:54
@Olian04 It is just a convention. You should be able to do `git diff ..` which results in the same output. — Saugat, Aug 14 '18 at 12:19
@Olian04 Without `..` you'll see the difference in your local branch as compared to `origin/master` i.e. `git diff ..`. Basically the colors revert, you can try it out yourself. — Saugat, Aug 14 '18 at 12:27

score 7 · Accepted Answer · answered Aug 14 '18 at 19:03

To get, programmatically, a count of the commits that differ between the current branch and its upstream, use `git rev-list --count --left-right HEAD...@{upstream}`, or `git rev-list --count master...master@{upstream}` for instance. Note the three dots here, which separate the branch name or `HEAD` from `branch@{upstream}`. This is how `git status` or `git branch -vv` prints ahead 1 or behind 2 or up to date or whatever.

Note that this assumes that you are on a branch in the first place, and that the branch has an upstream to be ahead and/or behind. If the upstream is a remote-tracking name like origin/master, this assumes that the value stored in the remote-tracking name is the one you want stored in it.
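As a rough sketch of how those two counts might be turned into an exit code for a script: the function names `classify_counts` and `branch_state` are made up for this example, and it assumes `git fetch` has already refreshed the upstream's remote-tracking name. With `--left-right` and `HEAD` on the left, the first number is how far ahead you are and the second is how far behind.

```shell
#!/bin/sh
# classify_counts AHEAD BEHIND: print a one-word state for the two counts.
# (Hypothetical helper; plain POSIX sh, so its variables are global.)
classify_counts() {
    ahead=$1
    behind=$2
    if [ "$ahead" -eq 0 ] && [ "$behind" -eq 0 ]; then
        echo up-to-date
    elif [ "$ahead" -eq 0 ]; then
        echo behind
    elif [ "$behind" -eq 0 ]; then
        echo ahead
    else
        echo diverged
    fi
}

# branch_state: compare HEAD with its upstream and print the state.
# Returns 0 only when up to date, 1 when ahead/behind/diverged,
# and 2 when there is no branch or no upstream to compare against.
branch_state() {
    counts=$(git rev-list --count --left-right 'HEAD...@{upstream}' 2>/dev/null) || return 2
    # The output is "<ahead><TAB><behind>"; word splitting separates the numbers.
    set -- $counts
    state=$(classify_counts "$1" "$2")
    echo "$state"
    [ "$state" = up-to-date ]
}
```

A script could then run something like `git fetch && branch_state >/dev/null` and branch on the exit status alone, without parsing any human-readable text.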

There is a lot more to know

If you are scripting this stuff, it's important to know (or define) precisely what you mean by up to date.

Purely locally—i.e., within one repository + work-tree combination—there are three entities to think about:

The current commit, aka HEAD.

This may be a detached HEAD, where HEAD contains a raw hash ID, or the opposite, on a branch, where HEAD contains the name of the branch itself. When on a branch, the branch name, e.g. master, contains the raw hash ID of the current commit. Either way, HEAD always refers to the current commit.¹

The current commit itself is read-only (entirely) and permanent (mostly—you can deliberately abandon commits, after which they eventually get removed). You can change which commit is the current commit (e.g., git checkout different-commit), but you cannot change the commits themselves. Since the commit cannot change, it's never "out of date" by definition: it is whatever it is. Like any commit, the current commit has some metadata (who made it, when, etc.) along with a complete snapshot of every file.

Files stored inside commits are in a special, Git-only format (and of course are read-only).

The work-tree, which is simply where you do your work.

Here, you can read and write every file. These files are in their ordinary format, not compressed and Git-specific. You can also have files here that are not known to Git, but before we can talk about this properly, we need to cover the third entity.

The index, also called the staging area or sometimes the cache.

The index has several uses (hence the multiple names) but I think it is best described as the next commit you would make, if you made a commit right now. That is, the index (which is actually just a file) holds all the information Git needs to make a new snapshot, to put into a new commit. Hence the index holds all the files that will go into the next commit you make.

Files in the index are compressed, and in a Git-only format, just like files in commits. The crucial difference for our purposes here, though, is that the files in the index can be changed. You can put new files into the index, or remove existing files from the index, as well.

All that git add file really does is to copy a file from the work-tree, into the index. This replaces the previous version in the index, so that the index now matches the work-tree. Or, if you wish to remove a file, git rm file removes that file from both the index and the work-tree.

¹A new repository has no commits at all, so there is an exception to this rule: HEAD can refer to a branch name that simply does not yet exist. That's the case in a brand new repository: HEAD says that the current branch is master, yet master does not actually exist until you make the first commit.

(The git checkout --orphan command can re-create this special "on a branch that does not exist yet" state for another branch. This is not something most people will do most of the time, but it can come up in programs that examine the state.)
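A program that examines the state can tell these three HEAD situations apart with two plumbing commands. This is only a sketch; `head_state` is a name invented here, and it assumes it is run inside a repository.

```shell
#!/bin/sh
# head_state: print "branch <name>", "unborn <name>", or "detached <hash>".
# (Hypothetical helper name; assumes the current directory is inside a repository.)
head_state() {
    # symbolic-ref succeeds only when HEAD holds a branch name;
    # with -q it exits non-zero silently for a detached HEAD.
    if name=$(git symbolic-ref -q --short HEAD); then
        # The branch name may not exist yet (new repository or --orphan).
        if git rev-parse -q --verify HEAD >/dev/null; then
            echo "branch $name"
        else
            echo "unborn $name"
        fi
    else
        echo "detached $(git rev-parse --short HEAD)"
    fi
}
```

Only the "branch" case can have an upstream to be ahead of or behind, so a script would typically check this first.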

What `git status` does

Since the index and work-tree are both writable, both can be "dirty" or cause something to be "out of date" in some way. If you consider the work-tree file to be the newest, it may be the index copy that's out of date, because it does not match the work-tree copy. Once the work-tree file is copied into the index, the index no longer matches the HEAD commit, and a new commit will be needed at some point.

What git status does, besides running git rev-list --count --left-right with the branch and its upstream and getting those numbers,² is that it runs, in effect, two git diffs (with --name-status since it's not interested in a detailed patch):

Compare HEAD to index. Whatever is different here, these are the changes that are staged for commit, because if you made a commit now, Git would snapshot the entire index, and that snapshot would differ from the current commit in precisely these files.
Compare index to work-tree. Whatever is different here, these are the changes that are not staged for commit. Once you run git add on these files, the index copy will match the work-tree copy, but no longer match the HEAD copy, so now those will be changes that are staged for commit.

²Note that git status first checks that you're on a branch, and if so, that the branch has an upstream setting. Also, this is all built into it, so it does not have to run a separate program, but the principle is the same.
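For scripting, the same two comparisons can be run directly and read through their exit status instead of their text. The sketch below relies on `git diff --quiet` exiting with 1 when there are differences and 0 when there are none; the two function names are invented for illustration.

```shell
#!/bin/sh
# has_staged_changes: succeeds if the index differs from HEAD
# ("changes staged for commit"). Hypothetical helper name.
has_staged_changes() {
    rc=0
    git diff --cached --quiet || rc=$?
    # Exit status 1 means "differences found"; anything else is clean or an error.
    [ "$rc" -eq 1 ]
}

# has_unstaged_changes: succeeds if the work-tree differs from the index
# ("changes not staged for commit"). Hypothetical helper name.
has_unstaged_changes() {
    rc=0
    git diff --quiet || rc=$?
    [ "$rc" -eq 1 ]
}
```

To see which files differ rather than just whether any do, `git diff --cached --name-status` and `git diff --name-status` print one line per file for the respective comparison.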

Untracked and maybe ignored

We can now properly define what it means for a file to be untracked, too. An untracked file is, quite simply, a file that is not in the index. That is, if we remove a file from the index (only) with git rm --cached, or if we create a file in the work-tree without creating a corresponding file in the index, we have a work-tree file that has nothing of the same name in the index. That's an untracked file.

If a file is untracked, git status normally whines about it: the diff it runs that compares the index to the work-tree says ah, here is a file in the work-tree that is not in the index, and Git would tell you that it is untracked. If it is untracked on purpose, you can have git status shut up about it, by listing that file—or a path-name pattern that matches it—in a .gitignore file. Essentially, just before complaining that some file is untracked, Git looks at the ignore directives.³ But if the file is in the index, Git never looks for its name in any .gitignore.

³The ignore directives also tell git add that any en-masse "add everything" should avoid adding that file, if it's currently untracked.
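If a script needs the untracked files themselves, `git ls-files` can list work-tree files that are not in the index, with or without applying the ignore directives. A minimal sketch follows; the three function names are made up here.

```shell
#!/bin/sh
# list_untracked: work-tree files that are not in the index and not ignored.
list_untracked() {
    git ls-files --others --exclude-standard
}

# list_ignored: untracked files that match an ignore directive.
list_ignored() {
    git ls-files --others --ignored --exclude-standard
}

# is_ignored PATH: succeeds if PATH is untracked and matches an ignore directive.
# By default check-ignore does not report files that are already in the index.
is_ignored() {
    git check-ignore -q -- "$1"
}
```

An empty result from `list_untracked` means `git status` would have no untracked files to complain about.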

Upstreams and remotes

An upstream for a branch can be a remote-tracking name, like origin/master. These names are your Git's way of remembering some other Git's branches. To update the remote-tracking names for the remote origin, you simply run git fetch origin.

Note that you can have more than one remote! If you add a second remote fred at some second URL, git fetch fred will call up the Git at that URL, and update your fred/master and so on. So it's important to run git fetch to the right remote.

Running git fetch with no additional name will fetch from the remote for the current branch's upstream, or from origin if the current branch has no upstream or there is no current branch, so this is usually just a matter of running git fetch.
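If a script wants to know which remote a plain `git fetch` is likely to contact, it can read the branch's configured remote. This is an approximation of the rule described above, not a reimplementation of Git's own logic, and `fetch_remote_name` is a name invented for this sketch.

```shell
#!/bin/sh
# fetch_remote_name: print the remote configured for the current branch,
# falling back to "origin" when HEAD is detached or no remote is configured.
# (Hypothetical helper; approximates the default described in the text.)
fetch_remote_name() {
    if branch=$(git symbolic-ref -q --short HEAD) &&
       remote=$(git config --get "branch.$branch.remote"); then
        echo "$remote"
    else
        echo origin
    fi
}
```

Note that the configured value can be `.` when the upstream is another local branch, in which case there is no other Git to fetch from.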

Submodules

Submodules are really just references to another Git repository, but this throws a whole new wrinkle into the general plan. Each Git repository has its own HEAD, work-tree, and index. These can be clean or dirty as before, and if the submodule is not in detached-HEAD state, the submodule's branch can be ahead of and/or behind its upstream.

Submodule repositories are, however, normally in detached-HEAD state. Each commit in the superproject lists the specific commit to which your Git should detach that submodule Git. When the superproject Git checks out the commit, the superproject Git stores the hash ID for the submodule into the superproject's index. That way each new superproject commit records the correct hash ID.

To change the hash ID, git add in the superproject copies the current hash ID of the actual checked-out submodule, into the index in the repository for the superproject (whew!). So if you've moved the submodule (via git checkout there), you navigate back to the superproject, run git add on the submodule path, and now the superproject's index records the correct hash ID, ready for the next superproject commit.

(Testing whether the submodule is on the commit desired by the superproject's index is more difficult.)
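One possible starting point, run from the superproject, is the prefix character that `git submodule status` puts in front of each hash: `-` for a submodule that is not initialized, `+` when the checked-out submodule commit does not match the hash in the superproject's index, and `U` for merge conflicts. The sketch below only classifies those prefixes; both function names are made up, and it says nothing about whether the submodule's own index or work-tree is dirty.

```shell
#!/bin/sh
# classify_submodule_line LINE: classify one line of `git submodule status`
# by its first character. (Hypothetical helper name.)
classify_submodule_line() {
    case "$1" in
        -*) echo uninitialized ;;
        +*) echo out-of-sync ;;
        U*) echo conflict ;;
        *)  echo in-sync ;;
    esac
}

# submodules_in_sync: returns 0 if every submodule is on the recorded commit,
# 1 if any is not, and 2 if `git submodule status` itself fails.
submodules_in_sync() {
    lines=$(git submodule status) || return 2
    while IFS= read -r line; do
        [ "$(classify_submodule_line "$line")" = in-sync ] || return 1
    done <<EOF
$lines
EOF
}
```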

How would you test in a script if the local branch is behind remote? I.e. not get a count and exit code 0, but get 0 if not, and not 0 if it is behind or before? — soloturn, Feb 04 '23 at 22:42