Can someone explain why this is happening?
Yes: it's the result of trying to treat a complicated (well, not that complicated) graph as if it were linear.
The setup
Every commit has a hash ID, as you have seen. This hash ID is how you can specify that particular commit, without worrying about how you get to it.
Every commit also saves / stores / provides the hash ID of its parent commit. Well, that is, it has one saved hash ID if it has one parent. A merge commit has two parents, so it provides two different hash IDs. ("A commit with two or more parents" is the actual definition of a merge commit. The other special case, which does not come up here, is a commit with no parent. Such a commit is called a root commit, and you get one of these for the very first commit you make in a new repository.)
Think of the act of storing a hash ID as keeping an arrow that points to another commit. If we have several ordinary (non-merge, non-root) commits that keep arrows to their parent commit, we can draw them using these arrows. To keep the drawing manageable, we can use one uppercase letter instead of a 40-character unpronounceable hash ID:
... <--E <--F <--G
Note that these arrows are attached to the child commit, not the parent, but since nothing about any commit can ever be changed, we can just remember that they point backwards like this, and draw them as connecting lines, to simplify the drawing:
...--E--F--G
We need a way to find the last commit of a branch, and that's where branch names like master and develop come in. In Git, the branch name stores the hash ID of the last commit on the branch (the tip of the branch, as Git calls it), and we can draw that as a name with another arrow:
...--E--F--G <-- master
Whenever we're on a branch and make a new commit, Git adds the commit to the branch by making the new commit have, as its parent, the current tip:
...--E--F--G <-- master
            \
             H
and then changing the branch name so that it points to the new commit we just made (and now we don't have to draw it on a separate line to leave room for the old master arrow):
...--E--F--G--H <-- master
If you have two names pointing to the same commit, only one of them can be the current branch. We'll add (HEAD) to remember which one is current:
...--E--F--G <-- develop (HEAD), master
Then we'll add a new commit:
...--E--F--G <-- master
            \
             H <-- develop (HEAD)
Now we'll check out master and add a new commit there too:
...--E--F--G--I <-- master (HEAD)
            \
             H <-- develop
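The bookkeeping described above can be sketched as a toy model in Python. This is not how Git stores anything internally; the names parents, branches, head, and commit are hypothetical, and E is treated as a root commit only to keep the example short.

```python
# Toy model: a commit records only its parent(s); a branch name records one tip.
# Assumption for brevity: E is a root here (in the drawings it has earlier history).
parents = {"E": [], "F": ["E"], "G": ["F"]}
branches = {"master": "G", "develop": "G"}
head = "develop"  # the branch name HEAD is attached to


def commit(new_id):
    """Create a commit whose parent is the current tip, then advance the current branch."""
    parents[new_id] = [branches[head]]
    branches[head] = new_id


commit("H")        # new commit while on develop
head = "master"    # stands in for: git checkout master
commit("I")        # new commit while on master

print(branches)
```

Both H and I end up with G as their parent, and each branch name has moved to the commit made while it was current, matching the last drawing.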
The merge
Now let's make a new merge commit, by doing:
git checkout develop
which moves our HEAD:
...--E--F--G--I <-- master
            \
             H <-- develop (HEAD)
and then:
git merge master
(this may be the wrong way around in general: many people prefer to merge from feature branches into their more-masterful branches, but that's an entirely different question). To do the merge, Git:
- locates the current commit (HEAD, commit H);
- locates the other commit (master, commit I);
- locates the "best" (nearest to the two tips) commit that's on both branches (commit G).
The last of these is the merge base for the merge. Git will then compare the merge base to the current commit, to see what we did:
git diff --find-renames <hash-of-G> <hash-of-H>
and then run a second comparison to see what they did:
git diff --find-renames <hash-of-G> <hash-of-I>
Git combines these changes, as best it can, applying both sets of changes to the merge base contents. If all goes well (or if we accidentally resolve the merge without resolving anything), we will get a merge commit, with two parents. The first parent will be the commit we're on right now, H; the second parent will be the other commit, I. Git will make our current branch name point to the new commit M (for merge):
...--E--F--G--I <-- master
            \  \
             H--M <-- develop (HEAD)
Although we can't really draw it well, even like this, this first and second parent notion is quite important; we'll see it again in a moment.
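To make the merge-base step concrete, here is a minimal sketch over the same toy graph. It is a brute-force illustration of "nearest commit that is on both branches", not Git's actual algorithm; the function names are hypothetical, and E is again treated as a root for brevity.

```python
# Toy commit graph from the drawings above (parent links only).
parents = {"E": [], "F": ["E"], "G": ["F"], "H": ["G"], "I": ["G"]}


def ancestors(commit):
    """Return the commit itself plus everything reachable through parent links."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen


def merge_bases(ours, theirs):
    """Common ancestors that are not themselves an ancestor of another common ancestor."""
    common = ancestors(ours) & ancestors(theirs)
    return {c for c in common
            if not any(c in ancestors(other) for other in common if other != c)}


base = merge_bases("H", "I")

# The resulting merge commit records both tips: first parent is the commit
# we were on (H), second parent is the commit we merged (I).
parents["M"] = ["H", "I"]
print(base, parents["M"])
```

E, F, and G are all on both branches, but only G survives the filter, because E and F are ancestors of G.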
The revert
As you found out, git revert simply adds a new commit whose effect is to back out some previous commit's changes. Let's draw that commit in now, as commit R for revert:
...--E--F--G--I <-- master
            \  \
             H--M--R <-- develop (HEAD)
Git has to show you commits one at a time
When you run git log, Git will start from your current commit. Our HEAD is attached to our develop, so that means Git follows the arrow from develop to commit R. Git shows us commit R, then moves on to R's parent, M.
Git now shows us M (and git log won't show a diff here even with -p since there are two parents it could diff against). Once it has done that, git log needs to show us M's parent ... but wait! There are two! Which one should it show?
What Git does at this point is to put both commits into a queue of "commits to show". It then picks one out of the queue, and shows it. The one it picks depends on the sorting options you choose when you run git log. (The default is to sort by committer-timestamp.) In your case, the one it picked was I, the tip of master. That puts I's parent G into the queue.
Git once again has two commits to choose from (G and H). It picks one, shows it to you, and puts that commit's parent into the queue. In our case, if it picks H, it puts G into the queue, but G is already in the queue, so the queue is now down to just one commit, and at this point the behavior becomes simple again (show G, then F, and so on).
In any case, git log has shown you this linearized view of something that's inherently not linear: the parents of merge M could come out in either order, and you can give git log sorting options that might change that order.
The suffix hat/caret (HEAD^) notation
While you can name a commit by its raw hash ID, this is often rather unpleasant to type. You can shorten them, but even then it's a bit tricky. You can use a mouse and cut-and-paste them, which is better. But there are many alternatives, all outlined in the gitrevisions documentation. One of these is the ^ suffix.
When you use the hat-suffix on a commit specifier, you are telling Git: Look up the commit I gave you, then find its parent. You can add a number after the hat, and if you do, you are telling Git: Look up the commit, then find the n'th parent. Most commits only have one parent, so only ^1 makes any sense, and you can omit the digit.
Hence, if HEAD is attached to develop, and develop names commit R, the string HEAD means commit R, but the string HEAD^ means the first parent of R, which is of course M. Note that you can write HEAD^1, if you like, to say the first parent of R, but of course there's just the one parent anyway.
With a merge, which has at least two parents, you can meaningfully select one of the two parents. We could therefore write HEAD^1^2 to mean: Starting from R, find its first parent, then find that commit's second parent. That would step from R to M and then from M to I.
You didn't use that, though; you used HEAD^1^1, spelled the simpler way, HEAD^^. That tells Git: Starting from R, find its first parent M, then find its first parent H. So this names commit H, as if you'd entered the hash of commit H on the command line.
When you run git reset --hard <commit-specifier>, you are telling Git to, first, resolve the <commit-specifier> part to a hash ID, and then, having located the commit, change the current branch name (the one HEAD is attached to) so that it points to that commit.
(Because Git always works backwards, commits after the chosen point become hard to find, unless you still have some other name you kept that lets you find them. It's always easy to work backwards: the hat suffix does that, for instance, and git log does that too. But Git literally can't go forwards unless it has first gone backwards and has remembered how it got there. You may eventually see this when you use some of the more complicated options to git rev-list.)
In your case, the commit you labeled a is equivalent to our R here, the one you labeled b is equivalent to our M here, and b's first parent must be the commit you labeled e:
...--e-----b--a
          /
...--d--c
so that a^ is b and a^^ is e.
Note that you can select commit d here with a^^2^: move from a to its first parent b, move from there to its second parent c, and move from there to its first parent d.
The tilde suffix
This also applies to git reset --hard HEAD~1
The tilde suffix is part of why this first parent notation is so important (for the rest of it, see git log --first-parent). A tilde, like a hat/caret, can be followed by a number. This number is, in effect, the number of times to repeat the ^1 operation. Given:
...--e-----b--a
          /
...--d--c
the name a~2 means a^^, which starts at a and moves back first-parent twice, to e. You cannot get to commits c or d this way as they're not along the first-parent chain; but a~3 would find the first (and probably only) parent of e, whatever that may be, and a~4 would find its parent, and so on.
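A minimal sketch of the tilde rule, again over the toy a/b/c/d/e graph: tilde and first_parent_chain are hypothetical helper names, and e is treated as a root only because its parent is not shown in the drawing (so a~3 has nothing to find in this model).

```python
# Parent lists in order: for b, e is the first parent and c is the second.
parents = {"a": ["b"], "b": ["e", "c"], "c": ["d"], "d": [], "e": []}


def tilde(commit, n=1):
    """Follow the first-parent link n times, like commit~n (that is, ^1 repeated n times)."""
    for _ in range(n):
        commit = parents[commit][0]
    return commit


def first_parent_chain(commit):
    """List every commit that some commit~n can name in this model."""
    chain = [commit]
    while parents[chain[-1]]:
        chain.append(parents[chain[-1]][0])
    return chain


print(tilde("a", 1), tilde("a", 2), first_parent_chain("a"))
```

Commits c and d never appear in the chain, since reaching them requires taking a second-parent step at b.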