The git stash
command violates some of Git's generally applicable rules, but in ways that turn out to be mostly unsurprising. Let's take a short side trip to review things you probably already know but may not have recognized as significant. After that, let's look at what git stash
does.
Git is mainly about commits
The first thing you need to know about Git is that it's mostly about commits, which are identified by hash IDs. These hash IDs, which you will see in git log
output, are useless to mere humans because there's no way to keep them straight. So Git augments them with names like master
—which is a branch name—or v1.2
, a tag name; or origin/feature
, which is a remote-tracking name.1 Each such name stores one (and only one) hash ID. We say that these names point to a commit:
a123456 <-- master
Each commit also stores a hash ID, which is the commit's parent. This makes the child commit "point back to" its parent. This lets Git follow a chain: start from the most recent commit—the one to which master
points, for instance—and do something with it; then do the same thing with that commit's parent; then do it again with the parent's parent (the grandparent), and so on. That's what git log
does, for instance. So really, we have:
... <-3c39aef <-a123456 <-- master
The name master
leads to commit a123456
, and then commit a123456
leads back to some earlier commit, and so on. Git, in other words, works backwards.
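To see the name-to-commit-to-parent chain for yourself, here is a minimal sketch in a throwaway repository. The temporary directory, the file name file.txt, the identity settings, and the commit messages are all made up for illustration; the hash IDs you get will differ from the ones shown above.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is named master
git config user.name "Example"
git config user.email "example@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two > file.txt
git commit -q -a -m "second"

tip=$(git rev-parse master)       # the hash ID stored under the name master
parent=$(git rev-parse master^)   # the hash ID that commit stores as its parent

echo "master -> $tip"
echo "parent -> $parent"
git log --oneline master          # walks backwards: second, then first
```

The name holds one hash ID, and everything else is reached by following parent links from there.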
1. These are also called remote-tracking branch names. I don't like the phrase remote-tracking branch, as the word "branch" is already pretty heavily loaded. (The word "track" is overloaded too! At least the word remote is usually used pretty consistently.) These names just remember the hash ID stored under a branch name in some other repository.
The index
The process of making a new commit, in Git, is straightforward, but at first surprising to anyone who has used almost any other version control system. When you run git commit
, Git takes whatever is currently in your index and uses that to make a new commit. Since this new commit is new, it gets a new, unique hash ID, different from every existing commit's hash ID. The new commit holds a snapshot of whatever was in the index. You are, of course, the author of the new commit; and the new commit's parent commit is the commit you had checked out just a moment ago, before you made the new one.
In other words, if, just a moment ago, a123456
was the current commit, and you ran git commit
, now a123456
is the parent of the new current commit, with its new hash ID. Let's assume that the new hash ID is b789abc
:
... <-a123456 <-b789abc <-- master
That Git uses this thing called the index is the first surprising part. Most version control systems have something like it but keep it hidden; Git requires that you know about it. Meanwhile, the fact that the ID to which master
points changes is the second surprising part.
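A small sketch of that second surprise, assuming a scratch repository with made-up file names and messages: record where master points, commit, and look again. The variable before plays the role of a123456 and after plays the role of b789abc.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "first"

before=$(git rev-parse master)        # the current commit before committing

echo two > file.txt
git add file.txt                      # copy the new version into the index
git commit -q -m "second"             # snapshot whatever is in the index

after=$(git rev-parse master)         # master now holds the new hash ID
new_parent=$(git rev-parse master^)   # the new commit points back to the old one

echo "before: $before"
echo "after:  $after"
echo "parent of new commit: $new_parent"
```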
Branch names
Note that the branch name master
doesn't really know anything about the branch itself. All it does is store the one hash ID! Well, there's one more feature: it automatically changes as you make new commits. That's what's special about branch names. They automatically change when you commit.
To choose which branch name should change, you run git checkout branch
. This also chooses which commit you have checked out. The commit you have checked out right now is the same one the name identifies. This is true at all times, by definition: if you have a branch checked out, the branch name "points to" (has as its value) the commit hash ID, and that's the commit you have checked out.
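As a rough illustration, the sketch below creates a second branch name (the name feature is an assumption, as is everything else in the scratch repository), checks it out, and commits. Only the checked-out name moves.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "first"

git branch feature                        # a second name for the same commit
git checkout -q feature                   # choose which name should change

current=$(git symbolic-ref --short HEAD)  # the branch that is checked out
echo two > file.txt
git commit -q -a -m "on feature"

head=$(git rev-parse HEAD)
feature_tip=$(git rev-parse feature)
master_tip=$(git rev-parse master)

echo "on branch: $current"
echo "HEAD:      $head"
echo "feature:   $feature_tip"
echo "master:    $master_tip"             # unchanged by the new commit
```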
The work-tree, and tracked, untracked, and ignored files
All files stored inside Git—in the repository, or even just in the index—are in special, often highly-compressed forms. The Git commands are often the only things on your computer that can deal with these compressed files. So Git needs a way to have copies of these files that are in their ordinary form, so that you can work with (or on) them. That's the work-tree.
Git also has this general idea of untracked files. The index gets right in your face here, even though you can't see it: Git defines a tracked file as one that's in the index. Git defines an untracked file as one that's not in the index. Specifically, it's a file that is in the work-tree, but not in the index.
Since git commit
makes new commits out of whatever is in the index, we can see from this definition that whatever's not tracked does not go into a new commit. (This is going to be highly relevant to git stash
.)
When you first git checkout
any existing commit, usually by using git checkout
on a branch name, Git generally fills the index with whatever is in that commit. So now the index matches the commit. Git then extracts everything from the index into the work-tree. So now the work-tree matches the index, which matches the current commit. In other words, you have three copies of every file.
When you run git add filename
, what you are really doing is telling Git: copy filename
from the work-tree into the index. If the file was already in the index, you are simply replacing the data, updating it to whatever you put in the work-tree. If the file was not in the index before, well, now it is, and now it is tracked. It's not committed yet—you've merely copied it into the index. But now it's in the index, so it's tracked.
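You can't see the index directly, but git ls-files lists what is in it. The sketch below, using a scratch repository and a hypothetical notes.txt, shows a file going from untracked to tracked purely because git add copied it into the index.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "first"

echo draft > notes.txt                    # exists only in the work-tree

tracked_before=$(git ls-files)            # files in the index
untracked_before=$(git ls-files --others) # in the work-tree, not in the index

git add notes.txt                         # copy it into the index

tracked_after=$(git ls-files)
untracked_after=$(git ls-files --others)

echo "tracked before:   $tracked_before"
echo "untracked before: $untracked_before"
echo "tracked after:    $tracked_after"
echo "untracked after:  $untracked_after"
```

Nothing has been committed at the end of this; the file is tracked simply because it is now in the index.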
When Git comes across an untracked file—a file that's in the work-tree, but not in the index—Git tends to complain about it. Git can be very whiny and noisy. So .gitignore
lets you tell Git: Hey, you know this file that's untracked? It's supposed to be untracked. Shut up about that already! This also tells Git not to start tracking the file—i.e., not to add it to the index—if you use an en-masse git add .
or git add --all
. It never takes the file out of the index if it's already there, though, so listing a file in .gitignore
never gets rid of the file. It only has an effect on untracked files.
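Here is a small sketch of that asymmetry, with made-up file names (config.local and scratch.tmp): one file is listed in .gitignore after it is already tracked, the other while it is still untracked.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

echo settings > config.local
git add config.local
git commit -q -m "track config.local"

echo scratch > scratch.tmp                # never added: untracked
echo config.local > .gitignore            # listed, but already in the index
echo scratch.tmp >> .gitignore            # listed while still untracked

git add .                                 # en-masse add
tracked=$(git ls-files)

echo "$tracked"                           # config.local is still there; scratch.tmp is not
```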
In any case, once you have used git add
to copy files into the index, you can make a new commit that saves that version of that file forever, under the hash ID that Git assigns to the new commit. If you run git commit
to make the new commit, the new commit gets added at the end of the current branch, because Git writes the new commit's hash ID into the branch name.
Git stash
So, finally, we can look at what git stash
does. (Note: I'm going to ignore git stash -u
and git stash -a
here, and just cover "normal" stashes.) What git stash save
(spelled git stash push in newer versions of Git) actually makes two commits. There's something a little bit unusual about the process, but Git uses the same commit mechanisms as before.
The first of these two commits saves the index. Since Git is built around making commits from the index, this is actually the easy part. Let's say you're on a commit whose hash ID is just H
:
...--G--H <-- master
The funny thing about git stash
's first commit is that it doesn't go on branch master
at all. In fact, it goes onto no branch. Let's represent this index commit with the letter i
, in lowercase:
...--G--H <-- master
        |
        i
Then, git stash
goes on to make a second commit. This one is harder to make, so Git uses a spare, temporary index. Git fills the temporary index by copying everything from the normal index, then updating it according to the files in the work-tree: the work-tree versions of tracked files are added to it, and files that are missing from the work-tree are removed from it. Now Git makes that second commit, but with not one but two parents: H and i. Let's call this second commit w
for "work-tree":
...--G--H <-- master
        |\
        i-w
The last step of git stash
is to set up a name to remember the hash ID of this w
commit. Git uses the special name refs/stash
(which is not a branch name):
...--G--H <-- master
        |\
        i-w   <-- refs/stash
Note that untracked files appear in neither commit: neither i
nor w
has any untracked files. If you did run git add
on some files, though, those files are in both i
and w
(and the contents in the i
and w
commits match, unless you changed the work-tree copy again after git add
ing it to the index).
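The two-commit structure can be inspected with ordinary Git commands. The following is a sketch in a scratch repository, with hypothetical file names and contents: one file has a staged version and a different unstaged version, and a second file is untracked. It uses a plain stash, with no -u or -a.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "first"

echo staged > file.txt
git add file.txt                   # the index now differs from H
echo unstaged > file.txt           # the work-tree now differs from the index
echo junk > untracked.txt          # never added, so untracked

H=$(git rev-parse HEAD)
git stash -q                       # plain stash

w=$(git rev-parse refs/stash)           # the work-tree commit
w_parent1=$(git rev-parse refs/stash^1) # first parent of w
i=$(git rev-parse refs/stash^2)         # second parent of w: the index commit
i_parent=$(git rev-parse "$i^")         # parent of i

i_content=$(git show "$i:file.txt")     # what was in the index
w_content=$(git show "$w:file.txt")     # what was in the work-tree
w_files=$(git ls-tree -r --name-only "$w")

echo "H=$H w=$w i=$i"
echo "i has: $i_content; w has: $w_content"
echo "files in w: $w_files"             # untracked.txt is absent
```

Both i and w lead back to H, master still points to H, and untracked.txt is left sitting in the work-tree.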
Later, when you git stash apply
or git stash pop
the i-w
pair, Git will extract whatever files are in those commits, compare them to what's in the H
commit, and use that to build the changes in your index and/or work-tree.
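As a sketch of the round trip, assuming the same kind of scratch repository with hypothetical contents: after stashing, the work-tree is back to what H holds, and git stash apply with the --index option asks Git to rebuild the staged state as well as the work-tree state.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "first"

echo staged > file.txt
git add file.txt                       # staged version
echo unstaged > file.txt               # different work-tree version

git stash -q
clean=$(cat file.txt)                  # work-tree matches H again

git stash apply -q --index             # rebuild index and work-tree changes

index_content=$(git show :file.txt)    # the copy in the index
worktree_content=$(cat file.txt)       # the copy in the work-tree

echo "after stash:   $clean"
echo "index now:     $index_content"
echo "work-tree now: $worktree_content"
```

Because this uses apply rather than pop, refs/stash still points to w afterwards.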
Conclusion
So, this is what you should know about git stash
:
- It makes commits.
- The commits it makes are really just like any other commits you could make, except that they're not on any branch.
- Because they're on no branch, they won't be copied by
git rebase
(which is mostly about copying old commits to newer, presumably improved, commits).
- What's in those commits is whatever was in the index (the real one for i, the temporary one for w) when you ran
git stash
, which is the same as the rule for any ordinary commits you make.
- You can always just make ordinary commits instead of using
git stash
. Unless you're pretty sure you have a quick and easy case, that's often a better way to work anyway.
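One way the ordinary-commit alternative from the last bullet might look is sketched below. The branch name wip and the commit message are assumptions, and this is just one possible workflow, not the only one: put the unfinished work on its own branch, then switch back.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "first"

master_before=$(git rev-parse master)

echo half-done > file.txt              # unfinished work
git checkout -q -b wip                 # new branch name; the changes come along
git commit -q -a -m "WIP: unfinished"  # an ordinary commit, on a branch

git checkout -q master                 # back to a clean master
on_master=$(cat file.txt)

wip_content=$(git show wip:file.txt)   # the work is safe in the wip commit

echo "master has: $on_master"
echo "wip has:    $wip_content"
```

Unlike the stash commits, this one is on a branch, so it shows up in git log for that branch and can be rebased, amended, or merged like any other commit.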
There are some more advanced things to know, such as the use of -u
(aka --include-untracked
) and -a
(aka --all
) and -k
(aka --keep-index
), but those are tricky to use correctly. I have seen many people use these without understanding them and get into trouble later. (In particular, extracting -u
and -a
stashes can be problematic. The -k
option, which I think is primarily meant for testing, is also a bit tricky to use in any automated way.)