All your arrows and branch labels are misleading, because they are all quite sensible. Git, however, works backwards. :-) Let's draw them the other way, the way Git does them:
A <-B <-C <-E   <-- origin/master
         \
          D   <-- foo
That is, the name origin/master contains the hash ID of commit E. Commit E contains the hash ID of commit C, which contains the hash ID of commit B, which contains the hash ID of commit A. Commit A has no other hash ID, because it's the first commit and can't have a parent, so it points nowhere.
All "interior" arrows point backwards. They have to, because commits, like all Git objects, are read-only once created. We know, when we create a new commit, what its parent hash ID is, but we don't know, when we create it, what its child or children will be, if and when they are ever created. Since these arrows always point backwards, there's no need to draw the parent arrows themselves; we can just connect the commits, remembering that they point backwards.
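Here is a minimal sketch of that idea in Python. It is a toy model, not how Git stores anything: the single letters stand in for real hash IDs, and the names commits, refs, and history are made up for illustration. Each commit records only its parent(s), and each name records exactly one hash ID.

```python
# Toy model: a commit knows its parent(s), never its children.
# Single letters stand in for real hash IDs (an assumption for readability).
commits = {
    "A": [],       # the first commit: no parent, so it points nowhere
    "B": ["A"],
    "C": ["B"],
    "D": ["C"],
    "E": ["C"],
}

# A name is just a pointer holding one hash ID.
refs = {"origin/master": "E", "foo": "D"}


def history(name):
    """Start at the commit a name points to and follow parents backwards."""
    current = refs[name]
    chain = [current]
    while commits[current]:
        current = commits[current][0]  # first (here: only) parent
        chain.append(current)
    return chain


print(history("origin/master"))  # ['E', 'C', 'B', 'A']
print(history("foo"))            # ['D', 'C', 'B', 'A']
```

Note that nothing in commit C mentions D or E: the only way to find them is to start from a name and walk backwards.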
Branch names, on the other hand, move all the time. So it's a good idea to keep the branch-name arrows. Let's add in the name master and an arrow, and note that master is the current branch (HEAD) as well:
        E   <-- origin/master
       /
A--B--C   <-- master (HEAD)
       \
        D   <-- foo
git pull origin/master
This isn't quite a valid Git command. The actual command is the peculiarly-spelled git pull origin master.
If you are a newcomer to Git, I recommend avoiding git pull entirely for a while. I think it mostly confuses people. All it really does is run two other Git commands for you: git fetch (passing on the rest of the arguments you gave it, if any, or a remote-name and branch-name it extracts from the current branch if not), followed by (normally) git merge.
Once you are familiar with the other Git commands and know what to expect from them, you can start using git pull as a convenience, provided that you find it convenient (sometimes it is, sometimes it's not).
Let's look first at git fetch instead. What git fetch does is call up another Git and ask it about its branches and tags.
This second Git has its own, independent master. Your Git finds out which commit hash their Git is identifying by their master. Your Git then obtains that commit by its hash ID. The hash ID is the "true name" of the commit; a name like master is just a moveable pointer, containing a hash ID, and the hash ID that your master, or their master, holds changes over time.
If their master names commit E, and you already have commit E, your Git does not have to download commit E. Your Git simply changes your own origin/master to point to commit E (which is no change at all, if it already points there).
If you don't have commit E yet, your Git gets it from their Git. (Along with commit E, your Git gets anything they have, that you need, that you don't already have, such as commits C, B, and/or A and/or all the tree and blob objects any of those might need. You will usually have most of these already, but whatever you don't have, they will package up and ship to you, so that your Git can set your origin/master.)
If their master names some other commit (any of A through D, or some commit you don't have yet), your Git will download whatever it needs so that it has that commit and all its auxiliary data and other reachable commits, then make your origin/master point to that commit by its hash ID. I'll assume for now that their master still points to E, though.
That's the end of all the work for git fetch: it obtains the various objects, and then updates your remote-tracking names (your origin/* names). Well, there's one more thing it does, of historical interest: it writes every name it fetched to .git/FETCH_HEAD. If you run git fetch, it will default to fetching all the branch and tag names from origin; if you run git fetch origin master, you tell it to fetch only one name, the one matching master (hence branch master), from the other Git that you call origin.
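The fetch behavior described above can be sketched as a toy model in Python. Everything here is an assumption made for illustration: the dictionaries mine and theirs, the function names, and the single-letter hash IDs. The sketch leaves out tags, trees, blobs, and .git/FETCH_HEAD, and starts with origin/master at C so that the update is visible.

```python
def reachable(commits, start):
    """Every commit reachable from start by following parent links."""
    seen, todo = set(), [start]
    while todo:
        commit = todo.pop()
        if commit not in seen:
            seen.add(commit)
            todo.extend(commits[commit])
    return seen


def fetch(mine, theirs, remote="origin", branch=None):
    """Copy whatever commits we lack, then update remote-tracking names.

    Our own branch names (master, foo) are never touched.
    """
    names = [branch] if branch else sorted(theirs["branches"])
    for name in names:
        tip = theirs["branches"][name]
        for commit in reachable(theirs["commits"], tip):
            # Only "download" commits we do not already have.
            mine["commits"].setdefault(commit, theirs["commits"][commit])
        mine["refs"][remote + "/" + name] = tip


theirs = {
    "commits": {"A": [], "B": ["A"], "C": ["B"], "E": ["C"]},
    "branches": {"master": "E"},
}
mine = {
    "commits": {"A": [], "B": ["A"], "C": ["B"], "D": ["C"]},
    "refs": {"master": "C", "foo": "D", "origin/master": "C"},
}

fetch(mine, theirs, branch="master")   # like: git fetch origin master
print(mine["refs"]["origin/master"])   # E
print(mine["refs"]["master"])          # C (unchanged)
print(sorted(mine["commits"]))         # ['A', 'B', 'C', 'D', 'E']
```

The point to take away is that fetch only adds objects and moves origin/* names; your own master stays where it was.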
git fetch && git merge origin/master
After running git fetch origin master, git pull origin master will, in effect, run git merge origin/master. It does so via the special FETCH_HEAD file, rather than by literally running git merge origin/master, but git pull origin master and git fetch && git merge origin/master will, in this case, do the same thing.
Note that git fetch is the unrestricted form: update all remote-tracking names. If you're not currently on your own master, or your master has a different upstream setting, git pull will run git fetch origin some-other-name, but git pull origin master will explicitly run git fetch origin master. Then it will run git merge with a hash ID extracted out of .git/FETCH_HEAD (and a -m argument as well). So there are a lot of differences here, but most are usually minor, assuming you're on master with its upstream set to origin/master.
The git merge step is a fair bit more complicated. This:
- Checks whether the index and current (HEAD) commit match, or if not, whether the merge looks safe. Ideally they should match (you should have run git commit if not). It's tricky to back out of a failed merge if the index and the HEAD commit don't match (although git merge --abort will do its best).
- Uses the current commit's hash ID and the merge target commit's hash ID to locate two specific commits. Since HEAD names master and master points to C, the current commit is C and the target is E. Git doesn't have a single consistent name for the target commit; I like to call the HEAD commit L for left/local/--ours and the other one R for right/remote/--theirs. It won't matter much here, though, as we'll see in a moment.
- Computes the merge base of the L and R commits. The merge base is, simply put (somewhat too simply in hard cases), the first place the two branches come together when we start at both L and R and work backwards. In this case, that's commit L (aka C) itself!
- Decides what to do based on the merge base. If there is no common ancestor, and hence no merge base commit, fail (in modern Git). If the merge base is not one of the two L and R commits, do a true merge. If the merge base is R, do nothing: there is nothing to merge. If the merge base is L / HEAD, do a fast-forward operation if allowed. If not allowed, resort to a true merge.
Since the merge base is L, and you did not say --no-ff, Git will use the fast-forward operation for this particular merge. The result will be to check out commit E and move the name master to point to E:
        E   <-- master (HEAD), origin/master
       /
A--B--C
       \
        D   <-- foo
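The decision git merge makes from the merge base can be sketched like this. It is a simplified model under stated assumptions: the function names are mine, single letters stand in for hash IDs, and the merge-base computation is the naive "common ancestors that are not ancestors of another common ancestor" rule, which is enough for simple graphs like this one.

```python
COMMITS = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"], "E": ["C"], "X": []}


def ancestors(commits, start):
    """start itself plus everything reachable through parent links."""
    seen, todo = set(), [start]
    while todo:
        commit = todo.pop()
        if commit not in seen:
            seen.add(commit)
            todo.extend(commits[commit])
    return seen


def merge_bases(commits, left, right):
    """Common ancestors that are not ancestors of another common ancestor."""
    common = ancestors(commits, left) & ancestors(commits, right)
    return {
        c for c in common
        if not any(c != other and c in ancestors(commits, other)
                   for other in common)
    }


def merge_action(commits, left, right, allow_ff=True):
    """left is the HEAD commit (L), right is the merge target (R)."""
    bases = merge_bases(commits, left, right)
    if not bases:
        return "fail: unrelated histories"
    if right in bases:
        return "nothing to merge"
    if left in bases and allow_ff:
        return "fast-forward"
    return "true merge"


print(merge_action(COMMITS, "C", "E"))                  # fast-forward
print(merge_action(COMMITS, "C", "E", allow_ff=False))  # true merge
print(merge_action(COMMITS, "E", "C"))                  # nothing to merge
print(merge_action(COMMITS, "D", "E"))                  # true merge
print(merge_action(COMMITS, "C", "X"))                  # fail: unrelated histories
```

With master at C and origin/master at E, the merge base is C itself, so the first call reports a fast-forward, matching the diagram above. Passing allow_ff=False plays the role of --no-ff.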
Finally:
git reset --soft HEAD~ && git stash save && git fetch && git reset --hard origin/master && git stash pop
This one is much more complex.
A soft reset using HEAD~1 tells Git to:
1. Find the current hash ID by reading .git/HEAD. This will normally contain a string like ref: refs/heads/master, which tells Git that the current branch is master; in that case, read the branch name itself to find the hash ID. If you're in "detached HEAD" mode, .git/HEAD will have a raw hash ID in it, rather than a branch name; this affects step 4 below.
2. Read that commit's parent hash ID (HEAD~ means HEAD~1 which means "one parent back along the first-parent line of ancestry").
3. Don't touch the index (--soft), and don't touch the work-tree (--soft or --mixed).
4. Write the new hash back into the current branch. Or, if HEAD is detached, write the new hash directly into .git/HEAD.
Since we have not touched the index and work-tree, they remain unchanged, regardless of whether we had a branch name to rewrite in step 4. Assuming that HEAD names master, and that the index and work-tree match commit C (to which master points), this soft reset will change the name master to point to commit B, leaving the index and work-tree matching the contents of commit C.
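To make the reset steps concrete, here is a toy Python model. The repo dictionary, the function names, and the idea of representing the index and work-tree by the letter of the commit they match are all assumptions for illustration; real Git stores full trees and files.

```python
# Single letters stand in for hash IDs; each commit has at most one parent.
PARENTS = {"A": None, "B": "A", "C": "B"}
PREFIX = "ref: refs/heads/"


def current_commit(repo):
    """Resolve HEAD to (branch name or None, hash ID)."""
    head = repo["HEAD"]
    if head.startswith(PREFIX):
        branch = head[len(PREFIX):]
        return branch, repo["refs"][branch]
    return None, head  # detached HEAD: a raw hash ID


def reset(repo, target, mode):
    """mode is 'soft', 'mixed', or 'hard'; it only says when to stop."""
    branch, _ = current_commit(repo)
    if branch is None:
        repo["HEAD"] = target            # detached: rewrite HEAD itself
    else:
        repo["refs"][branch] = target    # attached: move the branch name
    if mode in ("mixed", "hard"):
        repo["index"] = target           # index now matches target
    if mode == "hard":
        repo["worktree"] = target        # work-tree now matches target too


repo = {
    "HEAD": "ref: refs/heads/master",
    "refs": {"master": "C"},
    "index": "C",
    "worktree": "C",
}
_, tip = current_commit(repo)
reset(repo, PARENTS[tip], "soft")        # like: git reset --soft HEAD~
print(repo["refs"]["master"], repo["index"], repo["worktree"])  # B C C

detached = {"HEAD": "C", "refs": {"master": "C"}, "index": "C", "worktree": "C"}
reset(detached, "B", "hard")
print(detached["HEAD"], detached["refs"]["master"], detached["index"])  # B C B
```

The first call shows the soft reset from the text: master moves to B while the index and work-tree still match C. The second shows the detached case, where only .git/HEAD changes and master stays put.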
Next, git stash save writes two commits, not on any branch. One contains the contents of the index, and one contains the contents of the work-tree. (It doesn't matter that these two match each other, or that they match commit C for that matter; that just means that the two new commits use the existing top level tree object from commit C, which saves space.) The resulting diagram now looks like this:
       E   <-- origin/master
      /
     C--D   <-- foo
    /
A--B   <-- master (HEAD)
   |\
   i-w   <-- refs/stash
(I call the i-w commit clump, to which refs/stash points, a stash bag, because it hangs off the commit that was current when you ran git stash save.)
The git fetch step now does whatever it does, possibly adding more commits and/or moving origin/master to point somewhere. We'll assume here that it leaves origin/master pointing to commit E.
The git reset --hard origin/master now turns origin/master into a hash ID. This was step 1 above in our earlier git reset, but this time we don't read .git/HEAD, we just read the value of origin/master:
git rev-parse origin/master
Note that we can do the same to compute HEAD~1:
git rev-parse HEAD~1
At any time, git rev-parse can turn a name into a raw hash ID, whenever that's what we need. For git reset, that's what we need: what commit are we resetting to?
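As a rough illustration of what turning a name into a hash ID involves, here is a small Python sketch. It handles only plain names, HEAD, and the name~N suffix; the real git rev-parse understands far more syntax and searches several ref namespaces. The rev_parse function and the REPO and PARENTS tables are hypothetical, with single letters standing in for hash IDs and master assumed to point to C.

```python
PARENTS = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"], "E": ["C"]}
REPO = {
    "HEAD": "ref: refs/heads/master",
    "refs": {"master": "C", "foo": "D", "origin/master": "E"},
}
PREFIX = "ref: refs/heads/"


def rev_parse(expr, repo=REPO, parents=PARENTS):
    """Resolve 'name', 'name~', or 'name~N' to a hash ID."""
    name, tilde, count = expr.partition("~")
    # 'HEAD~' means 'HEAD~1'; no tilde at all means zero steps back.
    steps = int(count) if count else (1 if tilde else 0)

    if name == "HEAD":
        head = repo["HEAD"]
        if head.startswith(PREFIX):
            commit = repo["refs"][head[len(PREFIX):]]
        else:
            commit = head  # detached HEAD holds a raw hash ID
    else:
        commit = repo["refs"][name]

    for _ in range(steps):
        if not parents[commit]:
            raise ValueError(expr + ": ran past the root commit")
        commit = parents[commit][0]  # follow the first parent
    return commit


print(rev_parse("origin/master"))  # E
print(rev_parse("HEAD~1"))         # B
print(rev_parse("HEAD~"))          # B
print(rev_parse("foo~2"))          # B
```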
The git reset now writes that hash ID into master, and this time, because we used --hard, writes that commit's tree into the index and updates the work-tree to match. While the index and work-tree are not in the diagram, we now have this:
       E   <-- master (HEAD), origin/master
      /
     C--D   <-- foo
    /
A--B
   |\
   i-w   <-- refs/stash
(We could draw the A-B-C-D line horizontally here, or go back to having D down one row, except that the refs/stash is in the way.)
Last, the git stash pop takes whatever is in the w commit and tries to merge it, using git merge-recursive, with commit B as the merge base, the current index turned into a tree as the L tree (since we just git reset --hard to commit E, that's E as L), and the saved w commit as R. This merge may, depending on what has happened since commit B, see that there is no work to be done, and do nothing.
If it does nothing, or does something and thinks the merge succeeded, it drops the stash:
       E   <-- master (HEAD), origin/master
      /
     C--D   <-- foo
    /
A--B
It does not make any new commit, so the index and/or work-tree may now differ from the snapshot in commit E, if the merge did some work.
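The three-way merge that git stash pop performs can be sketched at whole-file granularity. This is a deliberate simplification: real Git also merges within files, line by line, and the snapshots below (file names and contents) are invented for illustration. Base is commit B, L is commit E, and R is the saved w commit.

```python
def three_way(base, ours, theirs):
    """Merge two snapshots (path -> contents) against a common base.

    Returns (merged snapshot, list of conflicting paths). A value of None
    means the file is absent from that snapshot.
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            pick = o              # both sides agree
        elif o == b:
            pick = t              # only theirs changed it
        elif t == b:
            pick = o              # only ours changed it
        else:
            conflicts.append(path)
            pick = o              # conflict: keep ours in this sketch
        if pick is not None:
            merged[path] = pick
    return merged, conflicts


snap_b = {"README": "v1"}                          # merge base (commit B)
snap_e = {"README": "v2", "notes.txt": "draft"}    # L: commit E
snap_w = {"README": "v1", "notes.txt": "draft"}    # R: stash w, same as C

merged, conflicts = three_way(snap_b, snap_e, snap_w)
print(merged == snap_e, conflicts)   # True []

# A stash that also held an uncommitted new file:
snap_w2 = dict(snap_w, **{"todo.txt": "call Bob"})
merged2, conflicts2 = three_way(snap_b, snap_e, snap_w2)
print(sorted(merged2), conflicts2)   # ['README', 'notes.txt', 'todo.txt'] []
```

In the first case, everything the stash changed relative to B is already in E, so the merge has no work to do. In the second, the result differs from E, which is exactly the "index and/or work-tree may now differ" situation described above.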
There are a number of important things to note here:
- git pull really is git fetch followed by a second Git command. The syntax for git pull is odd, and either of the two sub-commands it runs can fail, although a failure of git fetch is unlikely (and generally pretty harmless except for stopping the pull). A failure during git merge is common and requires manual intervention to complete or abort the merge. It's a good idea to know what you are doing here, including whether you're in a git merge that needs help; and to know that, it's good to run git merge yourself the first however-many times.
- git merge itself is quite complicated. It can do a fast-forward, which is not a merge at all (and never encounters merge conflicts). It can do nothing at all. Or, it can do a real merge, which can fail with merge conflicts. To find out what it will do, you must find the merge base, which requires looking at the commit graph (git log --graph). Some of the clicky web interfaces, such as those on GitHub, hide the graph from you, and make it difficult or impossible to tell what will happen.
- git stash is also quite complicated internally. When all goes well, it seems simple, but when it fails, it fails rather spectacularly.
- git reset has too many modes of operation to make it easy to use. With --soft, --mixed, and --hard, it works one way, and the three options just tell it when to stop working: after moving the current branch, or after resetting the index, or after resetting both index and work-tree. With other options, it works another (different) way.
- Using git stash for anything complicated is tricky. All it does is make commits anyway, so if you are doing something complicated, just make a commit that you can see and work with. You can remove it later with git reset with --soft or --mixed.