Before I dive into the below (which you should read and learn about because you will need it eventually), I want to mention that you can do this a much easier way, with git checkout -b <new-branch-name> <commit-id>. This has the effect of checking out an existing (older) commit and making a new branch name that points to that commit. It's like doing the git branch and then the git checkout and then the git reset below, all in one step.
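Here is a minimal sketch of that one-step form in a throwaway repository. The branch name new, the commit messages, and the use of empty commits are all just illustrative choices of mine.

```shell
set -e
cd "$(mktemp -d)"                        # throwaway repository
git init -q
git symbolic-ref HEAD refs/heads/master  # make sure the first branch is called master
git config user.name "Example"
git config user.email "example@example.com"
git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "second"
old=$(git rev-parse HEAD~1)              # hash of the older commit

# Create the branch name "new", pointing at the older commit, and switch to it.
git checkout -q -b new "$old"

git rev-parse --abbrev-ref HEAD          # prints: new
```

After this, new names the older commit while master still names the newer one.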
Using git reset
What git reset does is a little bit hard to explain, but easy to illustrate, especially if we have colors ... which we don't. So, let's do this in multiple parts instead.
Drawing the DAG
First, we have the actual set of commits inside the repository. These commits form a graph—specifically a Directed Acyclic Graph or DAG, though right now all we need is the name, and "DAG" is a nice short name. The main thing to know here is that each commit "points to" its parent commit(s), and these commits are quite solid and permanent. No matter what else we do, these commits will remain in the graph, at least for some time. We'll get to the exception (the "some time" part) later.
If we draw this with earlier commits towards the left and later (newer) ones towards the right, it looks something like this:
        o <- o
       /
o <- o
       \
        o <- o

(The slash / is standing in for a down-and-left arrow, which is not available in all fonts; the backslash \ stands in for an up-and-left arrow.) Each o node represents a commit and the arrows represent the parent pointers. This particular graph has six commits and no merges.
Each of these commits has a unique SHA-1 hash name, like a123456.... Those names never change: each name is specific to its one particular commit. You can use these names any time, but of course they are difficult to enter. (Sometimes I use the mouse to copy-and-paste these long strings of numbers.)
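To see the hash names and parent pointers for yourself, something like the following should work. It's only a sketch in a scratch repository; the variable names are mine.

```shell
set -e
cd "$(mktemp -d)"                        # throwaway repository
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "second"

full=$(git rev-parse HEAD)               # the full hash name of the current commit
short=$(git rev-parse --short HEAD)      # an abbreviated, easier-to-type form
parent=$(git rev-parse HEAD^)            # the commit that HEAD "points to"

echo "commit $full (short: $short) has parent $parent"
git cat-file -p HEAD                     # the raw commit includes a "parent <hash>" line
git log --graph --oneline --all          # Git's own drawing of the graph
```

The abbreviated form resolves to the same commit as the full hash, as long as it is unambiguous within the repository.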
Labeling the DAG
Since we know that earlier commits are leftward and later ones are rightward and commits point to their parents, I normally draw this more compactly. Also, we would generally like to know which branches these commits are on, so let's add some branch names:
      master
        |
        v
     o--o
    /
o--o
    \
     o--o
        ^
        |
     develop
This is the same six commits, but now we know which branches they're on: the first two commits (at the left, and in the middle vertically) are on both branches, then there are two commits that are only on master (along the top) and two commits that are only on develop (on the bottom). Note that the branch names point to the tip-most commit here.
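If you want to rebuild this six-commit picture and ask Git which commits are on which branch, here is one way it might be done. The empty commits and their one-letter messages are placeholders I made up.

```shell
set -e
cd "$(mktemp -d)"                        # throwaway repository
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

git commit -q --allow-empty -m "A"       # the two shared commits
git commit -q --allow-empty -m "B"
git branch develop                       # develop starts at the second shared commit
git commit -q --allow-empty -m "C"       # two commits only on master
git commit -q --allow-empty -m "D"
git checkout -q develop
git commit -q --allow-empty -m "E"       # two commits only on develop
git commit -q --allow-empty -m "F"

git log --graph --oneline --all          # draw the graph
git rev-list --count --all               # prints: 6
git rev-list --count master..develop     # prints: 2 (only on develop)
git rev-list --count develop..master     # prints: 2 (only on master)
git merge-base master develop            # the newest commit that is on both
```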
Moving branch names around
The next thing to know about branch labels is that they move. The commits are solid and permanent, but the labels are like little birds, flitting from commit to commit. Let's make the graph a little bit bigger by adding one more commit to develop, and see what happens to the label:
      master
        |
        v
     o--o
    /
o--o
    \
     o--o--o
           ^
           |
        develop
There we go: it moved! It still points to the tip commit, but that tip commit is the one we just added.
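You can watch the label move by recording where develop points before and after a commit. This is a sketch in a scratch repository, with names of my own choosing.

```shell
set -e
cd "$(mktemp -d)"                        # throwaway repository
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
git commit -q --allow-empty -m "base"
git checkout -q -b develop
git commit -q --allow-empty -m "work"

before=$(git rev-parse develop)          # where the label points now
git commit -q --allow-empty -m "more work"
after=$(git rev-parse develop)           # where it points after the new commit

echo "develop moved from $before to $after"
```

The old tip is still there: it is now the parent of the new tip.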
Adding new labels (new branches)
Now let's add a new label (a new branch) to this graph. I'll point the new label to the tip of develop as well. To make room, I'll draw the existing labels to the right, instead of above and underneath, but remember that these are the moveable labels.
     o--o   <-- master
    /
o--o
    \
     o--o--o   <-- develop
           ^
           |
          new
Using git reset
Now, assuming I have done git checkout new so that I'm on branch new, let's see what git reset does to the label. Specifically, let's take an old commit and get its number, e.g., fdeee3df9f54372c31506eb24f2b7f2339ba21ec (this particular number is Git release 2.8.1):
$ git branch new
$ git checkout new
Switched to branch 'new'
$ git reset --hard fdeee3df9f54372c31506eb24f2b7f2339ba21ec
HEAD is now at d95553a Git 2.8.1
The graph for the repository for Git itself is much too big to draw, but suppose I did this in a smaller repository like the one that now has just seven commits (of course the hash would no longer be fdeee3d...). Then I might now have this:
     o--o   <-- master
    /
o--o
    \
     o--o--o   <-- develop
     ^
     |
    new
(assuming I have given Git the right hash). If I give Git the hash of the very first commit in the repository, I get this:
     o--o   <-- master
    /
o--o
    \
^    o--o--o   <-- develop
|
new
git reset moves the branch label
The point here is that what git reset is doing is moving the branch label.
Since I'm on branch new, the label that git reset moves is new.
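A quick way to convince yourself that only the current branch's label moves is a sketch like this one, run in a scratch repository. The commit messages are placeholders, and I reset to the very first commit to match the last drawing.

```shell
set -e
cd "$(mktemp -d)"                        # throwaway repository
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
git commit -q --allow-empty -m "one"
git commit -q --allow-empty -m "two"
git commit -q --allow-empty -m "three"
tip=$(git rev-parse master)

git checkout -q -b new                   # new label, same commit as master
first=$(git rev-list --max-parents=0 HEAD)   # the very first (parentless) commit
git reset -q --hard "$first"             # move the label of the current branch

git rev-parse new                        # now the first commit
git rev-parse master                     # unchanged: still the old tip
```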
Be a little bit careful with git reset
What if I were to git checkout develop now, though, and git reset it to point back one step? That is, say I made the graph look like this:
     o--o   <-- master
    /
o--o
    \
^    o--o--o
|       ^
new   develop
Notice how there's no longer any arrow pointing to what used to be the tip-most commit of branch develop?
When this happens, that commit is now "abandoned" or "unprotected" (the more precise term is unreferenced).[1] An unprotected commit becomes eligible for garbage collection. The git gc command (which is invoked automatically for you as needed, so you don't normally ever have to run it) will find these abandoned leftovers and recycle them to get your disk space back.
Eventually, then, after git reset-ing away the extra commit on develop, Git will really remove it, and we'll have this:
     o--o   <-- master
    /
o--o
    \
^    o--o
|       ^
new   develop
That is, we'll be back to just the six commits.
A branch name protects the tip commit of its branch, of course, but it also protects all the commits that are not at the tip, because in Git, we can always follow the arrows, and commits point back (leftward, in these drawings) to their parent commits. Any commit we can reach by starting with a name, following an arrow to a commit, and following the commits' arrows to other commits is reachable. All those commits are therefore protected, and they remain in the DAG for as long as they stay reachable.
You must therefore be at least a little bit careful with git reset, to make sure you have some label (some branch name, usually) still pointing to the commits you want to keep. Making a new branch name, and moving that around, guarantees that you're OK.
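One cautious pattern, then, is to park an extra label on the tip before moving a branch back. Here is a sketch of that; the branch name keep is hypothetical, and I use master as the branch being reset just to keep the setup short.

```shell
set -e
cd "$(mktemp -d)"                        # throwaway repository
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
git commit -q --allow-empty -m "one"
git commit -q --allow-empty -m "two"
git commit -q --allow-empty -m "three"
tip=$(git rev-parse HEAD)

git branch keep                          # extra label on the current tip (name is mine)
git reset -q --hard HEAD~1               # move master back one step

git branch --contains "$tip"             # lists keep: the old tip is still reachable
```

Once you're sure you no longer need the old tip, deleting keep removes that protection again.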
Be more careful with git reset --hard
While all git resets move branch labels, the --hard reset does something else too: it erases work in your index/staging-area and in your work-tree, by resetting those to the same commit to which you're having git reset move the branch label. Often enough, as in your case, this is just what you want anyway.
In fact, sometimes you don't want to move the branch label, but you do want to reset your index and work-tree. In this case, you can just run git reset --hard without naming a specific commit. Git will "move" the branch label from its old commit to ... wherever it is now. That is, it actually stays right where it is. Then Git will go on to reset the index and work-tree, i.e., wipe out whatever changes you have. If you were working on the code for a while and have decided to start over, this is what you want: "set everything back to the way it is in the most recent commit." Using git reset --hard will do that. It still moves the branch label, but moving it from where it is, to where it is, means that you only see the "reset the index and work-tree" effect.
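Here's a small sketch of that "start over" case. The file name file.txt and its contents are made up for the example.

```shell
set -e
cd "$(mktemp -d)"                        # throwaway repository
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
echo "original" > file.txt
git add file.txt
git commit -q -m "add file"
before=$(git rev-parse HEAD)

echo "scratch work" > file.txt           # change the work-tree copy...
git add file.txt                         # ...and stage the change in the index too

git reset -q --hard                      # no commit named: the label stays put

cat file.txt                             # prints: original
git status --porcelain                   # prints nothing: index and work-tree are clean
```

Note that the scratch work is simply gone: it was never committed, so there is nothing to get it back from.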
[1] Fortunately, Git normally keeps every normal commit semi-protected for at least 30 days, using what Git calls reflogs. Each branch has its own reflog, and there's one big reflog for HEAD as well, and these remember commit IDs for 30 to 90 days by default. As long as the reflog is remembering (referencing) the commit ID, the commit is protected from garbage collection.
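If you do reset away a commit you wanted, the reflog is usually how you get it back. This is a sketch under the assumption that reflogs are enabled (the default for a normal, non-bare repository); the branch name rescued is mine.

```shell
set -e
cd "$(mktemp -d)"                        # throwaway repository
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
git commit -q --allow-empty -m "one"
git commit -q --allow-empty -m "two"
git commit -q --allow-empty -m "three"
tip=$(git rev-parse HEAD)

git reset -q --hard HEAD~1               # "three" no longer has a branch name
git reflog                               # but the HEAD reflog still lists it

lost=$(git rev-parse 'HEAD@{1}')         # where HEAD was just before the reset
git branch rescued "$lost"               # put a label on it again
```

With the new label in place, the commit is reachable again and no longer depends on the reflog entry.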