How can I better handle commits polluting a "closed" branch in git?

Question

We're new to git at my company, having come from Subversion. Over the weekend we ran into a situation where commits were made to the public version of a branch that we didn't want there. We had:

A -> B

And then got the bad commits to put the branch at:

A -> B -> C -> D

C and D should never have been on that branch. The trouble is this branch was "closed" -- this was a released version of our software and there shouldn't have been any new commits to this branch.

In Subversion the only way out of this sort of situation was to commit !D and !C so you end up with:

A -> B -> C -> D -> !D -> !C

This gets me back to B while keeping the branch moving forward on its timeline. Anyone with a remote of the branch who syncs with the master repository would get C and D and then have them undone, ending up at a logically equivalent version of B (but not B itself; call it B').

I came across this solution for reverting commits in git which seemed ideal: it would put our public repository back to A -> B. But it meant that any clone of this branch out on anyone's working machine would be very incorrect and everyone would need to re-clone. My fix amounted to:

git checkout thebranch
git reset --hard <<commit # associated with commit B>>
git push --force

I ended up going the route of the above link and it caused quite the stir that:

a) You can throw away commit history at the public repository with git like this, literally re-writing commit reality;

b) Everyone had to re-clone so they wouldn't risk re-injecting C -> D on to the branch (or the new branch of that branch that we wanted to create).

I think I should have done:

git revert --no-commit HEAD~2..HEAD
git commit
git push

But this would have left the branch as A -> B -> C -> D -> E and it really shouldn't have C -> D -> E on it because it's supposed to be closed.

I've got three questions:

1. How could I have handled the clean up better? Use revert instead of reset? What's the best practice here for branch pollution?
2. Did the push --force of the reverted branch really destroy the history at the public repository? Or did git roll back to B but keep a record of C -> D and that a revert was done back to B at some point by me? It definitely doesn't show the revert in the commit log, but maybe a record of my action is kept some place else?
3. How do you handle "closed" branches in git such that these changes couldn't have gotten on there in the first place? We did have a tag applied to the repository at commit B and people are supposed to use the branch + tag to get the source for the release, but this is still a scary thing to have changes show up on a branch line that should not have changes on it after commit B. And someone branching from the branch for a patch release could have easily missed the tag and pulled C -> D in to their new branch as well.

@pydave: sorry, yes, I meant a tag. Technology cross-contamination in my verbiage. :) — Ian C., Apr 12 '11 at 18:52

score 5 · Accepted Answer · answered Apr 12 '11 at 18:22

You should use tags for releases ("closed branches") and branches for development ("open branches"). This is both an alternative solution to fixing your problem (you could just use the tag for your released code) and a way to prevent this problem in the future.

While you're developing v3.1 you can have your v3.1 branch. Once v3.1 is complete, tag the last commit and rename the branch to your next development branch (v3.2). Remember: Git is not Svn! A branch is only a pointer to a commit. Deleting a branch doesn't delete the commits, but if those commits are not reachable from another branch or tag they become unreachable and can eventually be garbage-collected, so make a tag before you delete the branch.

If you want to develop a patch to v3.1 (v3.1.1), then you create a branch at the tag v3.1:

git checkout -b v3.1.1 v3.1

This will be clearer to your developers (branches are for development and tags are for releases) and prevent this issue from coming up again.
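As a rough illustration of the workflow above, here is a minimal sketch in a throwaway repository. The file name, commit messages, and user identity are made up for the example; only the names v3.1, v3.2, and v3.1.1 come from the answer. The sketch renames the branch before tagging so that a branch and a tag called v3.1 never exist at the same time.

```shell
set -e

# Throwaway repository; all paths and identities here are illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Develop the release on a branch named after it.
git checkout -q -b v3.1
echo "release code" > app.txt
git add app.txt
git commit -q -m "B: final v3.1 commit"
release=$(git rev-parse HEAD)

# Release time: the branch becomes the next development line,
# and an annotated tag pins the released commit.
git branch -m v3.1 v3.2
git tag -a v3.1 -m "Release 3.1" "$release"

# Later commits land on v3.2; the tag does not move.
echo "new work" >> app.txt
git commit -q -am "C: work for v3.2"

# A patch release starts from the tag, not from the moving branch.
git checkout -q -b v3.1.1 v3.1
patch_base=$(git rev-parse HEAD)
```

Because the tag never moves, anyone who starts the patch branch from v3.1 gets exactly the released commit, regardless of what has been committed to v3.2 since.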

Did the push --force of the reverted branch really destroy the history at the public repository?

No. If you had created a tag or branch at commit D, then that tag or branch would still be fine. Use git reflog to see recent changes to HEAD. Those commits should be in there. (Or try git fsck.)
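A small sketch of that recovery, assuming an ordinary (non-bare) local repository, where reflogs are on by default; a bare server repository may not keep them. The branch name "rescued", the file names, and the identity are invented for the example.

```shell
set -e

# Throwaway repository with the A -> B -> C -> D history from the question.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
git checkout -q -b thebranch
for name in A B C D; do
  echo "$name" > "$name.txt"
  git add "$name.txt"
  git commit -q -m "$name"
done
commit_d=$(git rev-parse HEAD)
commit_b=$(git rev-parse HEAD~2)

# Move the branch label back to B, as in the question.
git reset -q --hard "$commit_b"

# The branch's reflog still lists where the label used to point.
git --no-pager reflog show thebranch

# thebranch@{1} is the tip just before the reset, i.e. commit D.
previous_tip=$(git rev-parse 'thebranch@{1}')

# Pointing a new branch (or tag) at it makes C and D reachable again.
git branch rescued "$previous_tip"
```

After this, thebranch is at B and "rescued" is at D, so nothing was lost locally.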

For a good discussion of git that doesn't use other version control systems as a basis for understanding, try PeepCode's Git Internals. It's not free, but I think it's a lot better to understand git apart from centralized version control. (The Git Community Book looks like a good free alternative.)

Thanks. This is something that was not immediately obvious (tagging the branch, then renaming it to continue development). Very good to know. — Ian C., Apr 12 '11 at 18:45

score 1 · Answer 2 · answered Apr 12 '11 at 18:17

Regarding point 2: No; resetting a branch (locally or on the server) does not by itself destroy history. In git, a branch is akin to a label that is just pointing to a specific commit (and the label automatically moves forward when the branch is checked out and a new commit is made). Resetting a branch just means that the label is moved backward; the commits themselves remain, but become "invisible" if there are no other branches pointing to them. In this case, they can be recovered using git reflog, which lists the commits that HEAD has recently pointed to in that repository, including ones that no branch points to anymore. You will then see that the "deleted" commits are still there. The only thing that can destroy them is git gc (which git also runs automatically from time to time), which removes unreferenced commits after a grace period.

When you reset a branch to an earlier point and force-push it, other developers can reset their branch by using git fetch -f origin branchname:branchname; there is no need to re-clone. They should first check out a different branch, because git normally refuses to fetch into the currently checked-out branch, and overriding that would leave the index and working tree out of sync. However, this will cause them to "lose" commits (again, they can be found with git reflog) if they have made commits to the branch past the ones you've accidentally pushed. This is probably not a problem for you, since the branch was supposed to be closed anyway, but if it happens, they can create a new branch before force-fetching; then the new branch will still point to their newest commits, and git rebase can transplant the commits onto the proper branch.
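Here is one way the whole sequence might look, sketched with a local bare repository standing in for the public one. The directory names (origin.git, maintainer, developer), the branch name my-work, the extra commit E, and the identities are all assumptions made for the example.

```shell
set -e

work=$(mktemp -d)
cd "$work"
git init -q --bare origin.git

# Maintainer publishes A -> B -> C -> D on thebranch.
git init -q maintainer
cd maintainer
git config user.name "Maintainer"
git config user.email "maintainer@example.com"
git checkout -q -b thebranch
for name in A B C D; do
  echo "$name" > "$name.txt"
  git add "$name.txt"
  git commit -q -m "$name"
done
commit_b=$(git rev-parse HEAD~2)
git remote add origin "$work/origin.git"
git push -q origin thebranch

# A developer clones and adds local commit E on top of D.
cd "$work"
git clone -q --branch thebranch "$work/origin.git" developer
cd developer
git config user.name "Developer"
git config user.email "developer@example.com"
echo "E" > E.txt
git add E.txt
git commit -q -m "E: local work"

# Maintainer resets to B and force-pushes.
cd "$work/maintainer"
git reset -q --hard "$commit_b"
git push -q --force origin thebranch

# Developer: keep a pointer to the local work and leave thebranch,
# remember the old remote tip (D), then force-fetch the reset branch.
cd "$work/developer"
git checkout -q -b my-work
old_tip=$(git rev-parse origin/thebranch)
git fetch -q -f origin thebranch:thebranch

# Transplant only E (the commits after the old tip) onto the reset branch.
git rebase -q --onto thebranch "$old_tip" my-work
```

The developer ends up with thebranch at B and my-work holding a copy of E directly on top of B, without C and D and without re-cloning.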

I wish I could accept two answers. The second half of your answer is excellent. Thank you. — Ian C., Apr 12 '11 at 19:10
@Ian C.: Glad it helped. (Can I improve the first half somehow? ;-) ) — Aasmund Eldhuset, Apr 12 '11 at 19:14
@Aasmund - no. @pydave answered 1 and 3 for me, you answered 2. 3 was the most important to me. — Ian C., Apr 12 '11 at 19:42
@Ian C.: I didn't mean that you should accept my answer instead; I was just wondering what you felt was lacking with the first part since only the second part was deemed "excellent" - but it's not important; sometimes I'm just too eager to be a perfectionist. :-) — Aasmund Eldhuset, Apr 12 '11 at 20:26
