Question at the end. This was originally going to be a post asking for help to resolve this, but I figured out how in the process of writing it: git reset --hard [commit_hash] followed by git push -f.

With that said, the prior behavior of git is still confusing me, so it would be great if anyone could tell me why this issue was happening.


Original Post
I've been looking online and on Stack Overflow, and I thought this was the solution: "How to delete the last n commit on Github and locally?"

But I'm seeing some peculiar behavior when I attempt the recommended command and could really use some advice.

Background
2 Branches: m-branch and d-branch
They should have the same log history, which is normally handled by an application handling the merges. Because of stupid reasons there was a merge conflict only on d-branch. When I attempted to resolve the merge conflict, I foolishly overlooked that my text editor at the time wasn't autosaving like the other application I normally use. This led me to commit the code with the merge conflict formatting in it and push it to the remote. Due to my git inexperience, I tried a revert, thinking that would just move me back to the correct commit hash, but it made another commit instead. So the d-branch commit log looks like this:

a - the revert, develop's HEAD
b - the bad resolved merge conflict
c - master's HEAD
d - old commit
e - old commit

When I execute git reset --hard HEAD^^ it moves the d-branch head to commit e. If I instead do git reset --hard HEAD^ it moves the d-branch head to commit b. Then if I do it again it once more skips to e. This also applies to git reset --hard HEAD~1

Can someone explain why this is happening? Is there any other information I could provide?

Thanks

1 Answer

Can someone explain why this is happening?

Yes: it's the result of trying to treat a complicated (well, not that complicated) graph as if it were linear.

The setup

Every commit has a hash ID, as you have seen. This hash ID is how you can specify that particular commit, without worrying about how you get to it.

Every commit also saves / stores / provides the hash ID of its parent commit. Well, that is, it has one saved hash ID if it has one parent. A merge commit has two parents, so it provides two different hash IDs. ("A commit with two or more parents" is the actual definition of a merge commit. The other special case, which does not come up here, is a commit with no parent. Such a commit is called a root commit, and you get one of these for the very first commit you make in a new repository.)
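
To make the parent-count distinction concrete, here is a minimal Python sketch. It is a toy model, not how Git actually stores commits: the Commit class, its kind method, and the single-letter names standing in for hash IDs are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Commit:
    """Toy stand-in for a Git commit: a name plus its parents' names."""
    name: str                      # plays the role of the hash ID
    parents: Tuple[str, ...] = ()  # ordered: first parent, second parent, ...

    def kind(self) -> str:
        # Classify purely by how many parent IDs the commit stores.
        if len(self.parents) == 0:
            return "root"
        if len(self.parents) >= 2:
            return "merge"
        return "ordinary"


root = Commit("A")                 # very first commit: no parent
ordinary = Commit("B", ("A",))     # one parent
merge = Commit("M", ("H", "I"))    # two parents, so it is a merge

print(root.kind(), ordinary.kind(), merge.kind())
```

The frozen dataclass mirrors the point made later that nothing about a commit can change once it exists. The parents are kept as an ordered tuple because the order (first parent, second parent) matters for the ^ and ~ suffixes.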

Think of the act of storing a hash ID as keeping an arrow that points to another commit. If we have several ordinary (non-merge, non-root) commits that keep arrows to their parent commit, we can draw them using these arrows. To keep the drawing manageable, we can use one uppercase letter instead of a 40-character unpronounceable hash ID:

... <--E <--F <--G

Note that these arrows are attached to the child commit, not the parent, but since nothing about any commit can ever be changed, we can just remember that they point backwards like this, and draw them as connecting lines, to simplify the drawing:

...--E--F--G

We need a way to find the last commit of a branch, and that's where branch names like master and develop come in. In Git, the branch name stores the hash ID of the last commit on the branch—the tip of the branch, as Git calls it—and we can draw that as a name with another arrow:

...--E--F--G   <-- master

Whenever we're on a branch and make a new commit, Git adds the commit to the branch by making the new commit have, as its parent, the current tip:

...--E--F--G   <-- master
            \
             H

and then changing the branch name so that it points to the new commit we just made (and now we don't have to draw it on a separate line to leave room for the old master arrow):

...--E--F--G--H   <-- master

If you have two names pointing to the same commit, only one of them can be the current branch. We'll add (HEAD) to remember which one is current:

...--E--F--G   <-- develop (HEAD), master

Then we'll add a new commit:

...--E--F--G   <-- master
            \
             H   <-- develop (HEAD)

Now we'll check out master and add a new commit there too:

...--E--F--G--I   <-- master (HEAD)
            \
             H   <-- develop
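
The drawings above can be replayed with a small Python sketch. This is only a model under stated assumptions: the Repo class and its commit method are hypothetical names, commit E is treated as a root even though the drawings show earlier history as "...", and a branch is nothing more than a dictionary entry holding a tip name.

```python
class Repo:
    """Toy repository: commits know their parents, branches know their tip."""

    def __init__(self):
        self.parents = {}   # commit name -> tuple of parent names
        self.branches = {}  # branch name -> name of the tip commit
        self.head = None    # name of the current branch

    def commit(self, name):
        # The new commit's parent is the current tip (none for a first commit).
        tip = self.branches.get(self.head)
        self.parents[name] = (tip,) if tip is not None else ()
        # Then the current branch name moves to the new commit.
        self.branches[self.head] = name


repo = Repo()
repo.head = "master"
for name in "EFG":          # E is modelled as a root (an assumption)
    repo.commit(name)

repo.branches["develop"] = repo.branches["master"]  # new name, same commit G
repo.head = "develop"       # like: git checkout develop
repo.commit("H")

repo.head = "master"        # like: git checkout master
repo.commit("I")

print(repo.branches)
print(repo.parents["H"], repo.parents["I"])
```

Both H and I end up with G as their only parent, and only the branch that was current at commit time moved. That is the whole mechanism behind the forked drawing.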

The merge

Now let's make a new merge commit, by doing:

git checkout develop

which moves our HEAD:

...--E--F--G--I   <-- master
            \
             H   <-- develop (HEAD)

and then:

git merge master

(this may be kind of the wrong way around in general—many people prefer to merge from feature branches, into their more-masterful branches, but that's an entirely different question). To do the merge, Git:

  • locates the current commit (HEAD / commit H);
  • locates the other commit (master, commit I);
  • locates the "best" (nearest to the two tips) commit that's on both branches (commit G).

The last of these is the merge base for the merge. Git will then compare the merge base to the current commit, to see what we did:

git diff --find-renames <hash-of-G> <hash-of-H>

and then run a second comparison to see what they did:

git diff --find-renames <hash-of-G> <hash-of-I>

Git combines these changes, as best it can, applying both sets of changes to the merge base contents. If all goes well (or if we accidentally resolve the merge without resolving anything), we will get a merge commit, with two parents. The first parent will be the commit we're on right now, H; the second parent will be the other commit, I. Git will make our current branch name point to the new commit M (for merge):

...--E--F--G--I   <-- master
            \  \
             H--M   <-- develop (HEAD)

Although we can't really draw it well, even like this, this first and second parent notion is quite important; we'll see it again in a moment.
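
The merge-base step described above can be sketched in Python. This is a simplified model and not Git's actual algorithm: the function names ancestors and merge_bases are made up for the example, E is treated as a root, and "best" is taken to mean a common ancestor that is not itself an ancestor of another common ancestor.

```python
def ancestors(parents, start):
    """Every commit reachable from start by following parent arrows."""
    seen, stack = set(), [start]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def merge_bases(parents, ours, theirs):
    """Common ancestors that are not ancestors of another common ancestor."""
    common = ancestors(parents, ours) & ancestors(parents, theirs)
    return sorted(
        c for c in common
        if not any(c != d and c in ancestors(parents, d) for d in common)
    )


# The graph just before the merge; E is modelled as a root (an assumption).
parents = {"E": (), "F": ("E",), "G": ("F",), "H": ("G",), "I": ("G",)}

base = merge_bases(parents, "H", "I")
print(base)                       # ['G']

# The merge commit records both tips: first parent is where we were (H),
# second parent is the commit we merged in (I).
parents["M"] = ("H", "I")
print(parents["M"])
```

E and F are also on both branches, but G is nearer to the two tips, so it is the one that survives the filter.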

The revert

As you found out, git revert simply adds a new commit whose effect is to back out some previous commit's changes. Let's draw that commit in now, as commit R for revert:

...--E--F--G--I   <-- master
            \  \
             H--M--R   <-- develop (HEAD)

Git has to show you commits one at a time

When you run git log, Git will start from your current commit. Our HEAD is attached to our develop so that means Git follows the arrow from develop to commit R. Git shows us commit R, then moves on to R's parent, M.

Git now shows us M (and git log won't show a diff here even with -p since there are two parents it could diff against). Once it has done that, git log needs to show us M's parent ... but wait! There are two! Which one should it show?

What Git does at this point is to put both commits into a queue of "commits to show". It then picks one out of the queue, and shows it. The one it picks depends on the sorting options you choose when you run git log. (The default is to sort by committer-timestamp.) In your case, the one it picked was I, the tip of master. That puts I's parent G into the queue.

Git once again has two commits to choose from (G and H). It picks one, shows it to you, and puts that commit's parent into the queue. In our case, if it picks H, it puts G into the queue, but G is already in the queue, so the queue is now down to just one commit—and at this point the behavior becomes simple again (show G, then F, and so on).

In any case, git log has shown you this linearized view of something that's inherently not linear: the parents of merge M could come out in either order, and you can give git log sorting options that might change that order.
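
Here is a rough Python sketch of that queue-based walk. It is an illustration only: the log_order function, the integer timestamps, and treating E as a root are all assumptions, and the real git log has more options and details than this models.

```python
import heapq


def log_order(parents, when, start):
    """Show commits newest-first using a priority queue of pending commits."""
    queue = [(-when[start], start)]   # negate so the newest pops first
    seen = {start}
    shown = []
    while queue:
        _, commit = heapq.heappop(queue)
        shown.append(commit)
        for parent in parents[commit]:
            if parent not in seen:    # G is reachable twice, queue it once
                seen.add(parent)
                heapq.heappush(queue, (-when[parent], parent))
    return shown


parents = {
    "E": (), "F": ("E",), "G": ("F",),
    "H": ("G",), "I": ("G",),
    "M": ("H", "I"), "R": ("M",),
}

# Hypothetical committer timestamps: I was made after H.
when = {"E": 1, "F": 2, "G": 3, "H": 4, "I": 5, "M": 6, "R": 7}
print(log_order(parents, when, "R"))      # ['R', 'M', 'I', 'H', 'G', 'F', 'E']

# Same graph, but now H is the newer of the two merge parents.
swapped = dict(when, H=5, I=4)
print(log_order(parents, swapped, "R"))   # ['R', 'M', 'H', 'I', 'G', 'F', 'E']
```

The graph is identical in both runs. Only the timestamps differ, and that alone is enough to swap which parent of M appears first in the flattened output.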

The suffix hat/caret (HEAD^) notation

While you can name a commit by its raw hash ID, this is often rather unpleasant to type. You can shorten them, but even then it's a bit tricky. You can use a mouse and cut-and-paste them, which is better. But there are many alternatives, all outlined in the gitrevisions documentation. One of these is the ^ suffix.

When you use the hat-suffix on a commit specifier, you are telling Git: Look up the commit I gave you, then find its parent. You can add a number after the hat, and if you do, you are telling Git: Look up the commit, then find the n'th parent. Most commits only have one parent, so only ^1 makes any sense, and you can omit the digit.

Hence, if HEAD is attached to develop, and develop names commit R, the string HEAD means commit R, but the string HEAD^ means the first parent of R, which is of course M. Note that you can write HEAD^1, if you like, to say the first parent of R, but of course there's just the one parent anyway.

With a merge, which has at least two parents, you can meaningfully select one of the two parents. We could therefore write HEAD^1^2 to mean: Starting from R, find its first parent, then find that commit's second parent. That would step from R to M and then from M to I.

You didn't use that, though; you used HEAD^1^1, spelled the simpler way, HEAD^^. That tells Git: Starting from R, find its first parent M, then find its first parent H. So this names commit H, as if you'd entered the hash of commit H on the command line.

When you run git reset --hard <commit-specifier>, you are telling Git to, first, resolve the <commit-specifier> part to a hash ID, and then, having located the commit, change the current branch name—the one HEAD is attached to—so that it points to that commit.

(Because Git always works backwards, commits after the chosen point become hard to find, unless you still have some other name you kept that lets you find them. It's always easy to work backwards: the hat suffix does that, for instance, and git log does that too. But Git literally can't go forwards unless it has first gone backwards and has remembered how it got there. You may eventually see this when you use some of the more complicated options to git rev-list.)

In your case, the commit you labeled a is equivalent to our R here, the one you labeled b is equivalent to our M here, and b's first parent must be the commit you labeled e:

...--e-----b--a
          /
 ...--d--c

so that a^ is b and a^^ is e.

Note that you can select commit d here with a^^2^: move from a to its first parent b, move from there to its second parent c, and move from there to its first parent d.
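
As a sanity check on the hat notation, here is a small Python sketch that resolves ^ suffixes against the question's graph and then mimics what the reset does to the branch name. The resolve and reset_hard functions are hypothetical helpers for this example, d and e are modelled as having no recorded parents because their history is elided, and only the branch-name move is modelled (a real git reset --hard also rewrites the index and work-tree).

```python
import re

# The question's graph. Parent order matters: b is the merge, its first
# parent is e and its second parent is c. History before d and e is
# elided in the drawings, so they have no parents here (an assumption).
PARENTS = {
    "a": ("b",),
    "b": ("e", "c"),
    "c": ("d",),
    "d": (),
    "e": (),
}


def resolve(start, suffix, parents=PARENTS):
    """Apply a chain of ^ / ^N suffixes to a starting commit name."""
    if not re.fullmatch(r"(\^\d*)*", suffix):
        raise ValueError(f"unsupported suffix: {suffix!r}")
    commit = start
    for digits in re.findall(r"\^(\d*)", suffix):
        n = int(digits) if digits else 1   # a bare ^ means ^1
        if n == 0:
            continue                       # ^0 names the commit itself
        commit = parents[commit][n - 1]    # N-th parent, counting from 1
    return commit


def reset_hard(branches, current, suffix):
    """Move the current branch name to whatever HEAD<suffix> resolves to."""
    branches[current] = resolve(branches[current], suffix)
    return branches[current]


print(resolve("a", "^"))      # b
print(resolve("a", "^^"))     # e
print(resolve("a", "^^2"))    # c
print(resolve("a", "^^2^"))   # d

branches = {"d-branch": "a"}
print(reset_hard(branches, "d-branch", "^"))   # b
print(reset_hard(branches, "d-branch", "^"))   # e
```

The last two lines reproduce the behavior from the question: one HEAD^ lands on b, and a second HEAD^ from there lands on e, because e is b's first parent. Commit c is only reachable by asking for the second parent explicitly.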

The tilde suffix

This also applies to git reset --hard HEAD~1

The tilde suffix is part of why this first parent notation is so important (for the rest of it, see git log --first-parent). A tilde, like a hat/caret, can be followed by a number. This number is, in effect, the number of times to repeat the ^1 operation. Given:

...--e-----b--a
          /
 ...--d--c

the name a~2 means a^^, which starts at a and moves back first-parent twice, to e. You cannot get to commits c or d this way as they're not along the first-parent chain; but a~3 would find the first (and probably only) parent of e, whatever that may be, and a~4 would find its parent, and so on.
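
The tilde rule can be sketched the same way. In this toy model the tilde and first_parent_chain functions are made-up names, and the commit x is a hypothetical placeholder for "the parent of e, whatever that may be"; d is modelled as having no recorded parent because its history is elided.

```python
# Same graph as in the drawing, plus a placeholder parent x for e.
PARENTS = {
    "a": ("b",),
    "b": ("e", "c"),   # first parent e, second parent c
    "c": ("d",),
    "d": (),           # earlier history elided (an assumption)
    "e": ("x",),       # x is a hypothetical stand-in for e's real parent
    "x": (),
}


def tilde(commit, n, parents=PARENTS):
    """commit~n: take the first parent, n times in a row."""
    for _ in range(n):
        commit = parents[commit][0]
    return commit


def first_parent_chain(commit, parents=PARENTS):
    """All commits reachable by following only first parents."""
    chain = [commit]
    while parents[chain[-1]]:
        chain.append(parents[chain[-1]][0])
    return chain


print(tilde("a", 1))               # b
print(tilde("a", 2))               # e
print(tilde("a", 3))               # x
print(first_parent_chain("a"))     # ['a', 'b', 'e', 'x']
```

Neither c nor d appears in the chain, which is why no amount of ~ (or bare ^) starting from a will ever reach them.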

torek
  • This was an absolutely amazing writeup. Thank you, torek. I really think this helped me have a much deeper understanding of how git flows. It sounds like had I done `git reset --hard HEAD^^2` that would have been the same as the solution I used. Again, thank you for being so thorough in your explanation. – Andrenikous Feb 21 '18 at 06:48