How to retrieve deleted code while still keeping the new code?

Question

So, I have this case of merge commit:

           --D--E--
          /        \
--A--B--C-          ---H---
          \        /
           --F--G--

Where :

Commits A, B, and C are on the develop branch
Commits D and E are on the feature branch
Commits F and G are on the develop branch, merged from another feature branch
Commit H is the merge of feature into develop

The problem is that when I merged feature into develop, we lost some code that was introduced before the branch point. For example:

Before the branch (A, B, C and before that), in file Z we have :

Code 1
Code 2

In the feature branch (D, E), the committer removed Code 2 and also added Code 3, so the file becomes:

Code 1
Code 3

In develop branch (F, G), Code 4 is added so the file becomes :

Code 1
Code 2
Code 4

Now after the merge (H), the file becomes :

Code 1
Code 3
Code 4

But the merge deleted Code 2 because it was deleted in feature.

What I want :

Code 1
Code 2
Code 3
Code 4

The merge didn't show any conflict for this file.

So how do I keep the old code while still having the new code?

One thing to note is that I have pushed this merge to the remote repo.

Your diagram is a little bit off, because merging _two_ branches into the `develop` branch should result in _two_ merge commits, not one. — Tim Biegeleisen, Nov 04 '16 at 01:16
You are in a bit of a conundrum. There is a way to tell Git to keep only the versions of one party in a merge. But in your case, this won't work because you want changes from both branches. One option here would be to use the history from your IDE Git plugin to simply add back the missing code. — Tim Biegeleisen, Nov 04 '16 at 01:18
@TimBiegeleisen No, the merge is from `feature` to `develop`, the other branch is already merged which results in commit F and G, so there's only one merge here. The merge from another branch which becomes F and G is not included because it's irrelevant. — Nosrep Elpoep, Nov 04 '16 at 01:25
"One option here would be to use the history from your IDE Git plugin to simply add back the missing code." @TimBiegeleisen So basically doing the good ol' copy and paste? — Nosrep Elpoep, Nov 04 '16 at 01:27
The bottom line: This sort of thing can happen in Git. Perhaps the best you could do here would be to force a merge conflict on every file modified by both branches. And you can read about this here: http://stackoverflow.com/questions/5074452/git-how-to-force-merge-conflict-and-manual-merge-on-selected-file — Tim Biegeleisen, Nov 04 '16 at 01:28
You need to realize that Git saw a newer feature branch removing some code, so it took that as being the version of things you actually wanted. — Tim Biegeleisen, Nov 04 '16 at 01:29

score 2 · Answer 1 · edited May 23 '17 at 12:33

Before I get into all the details below, if you're just asking how to find the deleted code (relatively) easily, you can run git diff on the merge base and the two commits. This is the same set of diffs that git merge merged. To find the merge base, use git merge-base:

$ merge=1234567...   # or some other way to locate the merge commit
$ git merge-base --all ${merge}^1 ${merge}^2

Ideally, this prints out just one commit ID, which is the merge base. You can then git diff that hash ID against ${merge}^1 (the merge's first-parent) and ${merge}^2 (the merge's second parent).

As Tim Biegeleisen noted in a comment, your diagram does not quite seem to match your text. Fortunately, from the text we can describe a much simpler situation, where problems arise from separate files' interactions. The effect is the same, but the setup is easier:

       A   <-- branch1
      /
...--*
      \
       B   <-- branch2

Here we have one commit A on branch1 and a different commit B on branch2, both descending from a common base branch whose tip commit is *. (We only really need to care that commit * exists, for the upcoming merge.)

Suppose that in the merge base, there is a function prepare() defined in shared.py that is not actually used anywhere. It was meant for use in branch1, and is now used in branch1 from work.py. But the author of commit B in branch2 decided to delete prepare() precisely because it was never used.

You now wish to combine commits A and B, perhaps as a new commit on branch1:

$ git checkout branch1
$ git merge branch2

Git compares commit * to commit A: this says, among other things, "add call to prepare in file work.py". This is no problem for Git, because commit B does not even touch file work.py.

Git then compares commit * to commit B: this says, among other things, "remove the definition of prepare from file shared.py". This is no problem for Git either, because commit A does not touch this part of shared.py (and maybe even does not touch shared.py at all).

The result is a new merge commit:

       A---C   <-- branch1
      /   /
...--*   /
      \ /
       B   <-- branch2
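
To see this happen, the scenario can be reproduced in a throwaway repository. The following is a minimal sketch, not the asker's actual setup: the file contents, the extra helper() function, and the commit messages are all illustrative assumptions. It builds commits *, A, and B, merges them, and then runs the merge-base diffs described at the top of this answer.

```shell
#!/bin/sh
# Sketch only: file contents and commit messages are made up for illustration.
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Commit *: shared.py defines prepare(), which nothing calls yet.
printf 'def helper():\n    return 1\n\n\ndef prepare():\n    return 42\n' > shared.py
git add shared.py
git commit -q -m "*: base"
root=$(git rev-parse HEAD)
git branch branch1
git branch branch2

# Commit A on branch1: work.py starts calling prepare().
git checkout -q branch1
printf 'from shared import prepare\n\nprepare()\n' > work.py
git add work.py
git commit -q -m "A: call prepare() from work.py"

# Commit B on branch2: prepare() is deleted because it looked unused.
git checkout -q branch2
printf 'def helper():\n    return 1\n' > shared.py
git commit -q -a -m "B: remove unused prepare()"

# Commit C: the merge completes without reporting any conflict.
git checkout -q branch1
git merge -q -m "C: merge branch2" branch2

# Compare the merge base against each parent of the merge.
merge=$(git rev-parse HEAD)
base=$(git merge-base --all "${merge}^1" "${merge}^2")
git diff "$base" "${merge}^1"   # branch1's side: adds work.py
git diff "$base" "${merge}^2"   # branch2's side: deletes prepare()
```

The second diff is where the deletion shows up, even though the merge itself raised no complaint.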

Git is a tool (pun maybe not intended), not a solution

It is your job, as the person doing the merge, to tell whether Git got the merge right. Git will get this merge wrong, in part because whoever wrote commit B did the wrong thing. If you simply run git merge and push the result because the merge succeeded, that's your mistake right there, and you will need to recover from it. We'll get to this in a moment, but first, let's look at ways to avoid or correct the mistake early.

Test thoroughly

This is not perfect but has the advantage of being easy to do by computer. If there are good tests, you can do the git merge and then run the tests. If the tests fail, you know something is wrong with the merge—so stop using it, and go back and fix it.
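
As a rough sketch of that merge-then-test loop, assuming the same made-up shared.py / work.py repository as in the example scenario: run_tests here is a hypothetical stand-in for a real test suite (it only checks that prepare() is still defined), and a failing run discards the unpushed merge with git reset --hard HEAD^.

```shell
#!/bin/sh
# Sketch only: run_tests is a hypothetical stand-in for a real test suite.
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

printf 'def helper():\n    return 1\n\n\ndef prepare():\n    return 42\n' > shared.py
git add shared.py
git commit -q -m "*: base"
git branch branch1
git branch branch2

git checkout -q branch1
printf 'from shared import prepare\n\nprepare()\n' > work.py
git add work.py
git commit -q -m "A: call prepare() from work.py"

git checkout -q branch2
printf 'def helper():\n    return 1\n' > shared.py
git commit -q -a -m "B: remove unused prepare()"

git checkout -q branch1
git merge -q -m "C: merge branch2" branch2

# Stand-in test: work.py needs prepare() to exist in shared.py.
run_tests() {
    grep -q 'def prepare' shared.py
}

if run_tests; then
    echo "tests passed: keep the merge"
else
    echo "tests failed: discarding the unpushed merge"
    git reset -q --hard HEAD^
fi
```

In a real project the body of run_tests would invoke whatever test runner the project already uses.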

Scan the merge manually

This can catch problems that no test covers, but it can also miss problems that a test would have caught, because it relies on fallible humans.

If you spot a problem this way that does not show up during testing, you may be able to write an automated test to detect it. If that does not take too long, that's a good idea.

Options for fixing the problem before pushing

Since the merge is not used anywhere, you can just un-make it:

$ git reset --hard HEAD^

which discards the merge (commit C) so that branch1 points to A again. We do not need to keep the index or work-tree either, since Git made the merge in a fully automated manner. We now must decide how to fix the problem.

Option 1: restore the missing code on `branch2`

$ git checkout branch2
... edit shared.py to restore the code ...
$ git add shared.py; git commit

This gives us:

       A      <-- branch1
      /
...--*
      \
       B--C   <-- branch2

where a merge to create D on branch1 will again succeed, but not remove the necessary code. We can re-merge and re-test.

Option 2: do the merge but don't commit it:

$ git merge --no-commit branch2

Now we can fix up shared.py, git add it, and commit. This option is fast, but a little bit dirty, because we now have a merge commit that has a manual fixup from a problem not seen by Git itself. If the problem is found through automated testing, and we consistently use automated testing, and for some reason we have to repeat this merge in the future, we'll re-discover the problem. If we also comment on this in the merge message, that's probably sufficient.

This option does have the advantage that we did not have to add another commit to branch2.
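
Option 2 might look like the following sketch, again in a made-up throwaway repository. One loud assumption: the fix-up uses git checkout HEAD -- shared.py to take branch1's copy of the file, which is only safe here because deleting prepare() was the only change commit B made to shared.py. In general the file would be edited by hand.

```shell
#!/bin/sh
# Sketch only: repository contents and commit messages are illustrative.
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

printf 'def helper():\n    return 1\n\n\ndef prepare():\n    return 42\n' > shared.py
git add shared.py
git commit -q -m "*: base"
git branch branch1
git branch branch2

git checkout -q branch1
printf 'from shared import prepare\n\nprepare()\n' > work.py
git add work.py
git commit -q -m "A: call prepare() from work.py"

git checkout -q branch2
printf 'def helper():\n    return 1\n' > shared.py
git commit -q -a -m "B: remove unused prepare()"

# Merge, but stop before the merge commit is created.
git checkout -q branch1
git merge -q --no-commit branch2

# Fix-up: restore branch1's shared.py (assumption: B changed nothing else in it).
git checkout HEAD -- shared.py

# Record the manual fix-up in the merge message, as suggested above.
git commit -q -m "Merge branch2; keep prepare(), which work.py now calls"
```

The resulting commit is still a two-parent merge; only its content differs from what Git would have produced on its own.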

There's a slight variant of Option 1 that is more work than either, but may be the best of all for some cases (those where disturbing branch2 is a problem): make a new branch with the fixed version of B, and merge that, so that we get:

       A---D   <-- branch1
      /   /
...--*   C     <-- fixed-branch2
      \ /
       B       <-- branch2
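
A sketch of this variant, under the same assumptions as before (a made-up repository in which B's only change to shared.py was deleting prepare(), so the merge-base copy of the file can be restored wholesale):

```shell
#!/bin/sh
# Sketch only: repository contents and commit messages are illustrative.
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

printf 'def helper():\n    return 1\n\n\ndef prepare():\n    return 42\n' > shared.py
git add shared.py
git commit -q -m "*: base"
git branch branch1
git branch branch2

git checkout -q branch1
printf 'from shared import prepare\n\nprepare()\n' > work.py
git add work.py
git commit -q -m "A: call prepare() from work.py"

git checkout -q branch2
printf 'def helper():\n    return 1\n' > shared.py
git commit -q -a -m "B: remove unused prepare()"
b=$(git rev-parse HEAD)

# Commit C: fix the problem on a new branch, leaving branch2 untouched.
git checkout -q -b fixed-branch2 branch2
base=$(git merge-base branch1 branch2)
git checkout "$base" -- shared.py   # assumption: B changed nothing else here
git commit -q -m "C: restore prepare()"

# Commit D: merge the fixed branch instead of branch2.
git checkout -q branch1
git merge -q -m "D: merge fixed-branch2" fixed-branch2
```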

Option 3: Fixing things after-the-fact

Of course, what we have now is that the merge has been pushed (or published). Retracting published merges is even harder than retracting published regular (non-merge) commits, because reverting a merge sets us up to have a different but related failure on subsequent merges. So we can just leave the "bad" version out there and simply make a fix and push that:

       A---C--D   <-- branch1
      /   /
...--*   /
      \ /
       B   <-- branch2

Commit D, which we make as the new tip of branch1, just puts back the missing function—the same as we would do if we were fixing branch2 directly.

This leaves C behind as a "broken commit". That's basically just the way things are: people have that commit; people are using it; so we're kind of stuck with it. It causes headaches later if/when we go to use git bisect to find a problem, since commit C doesn't really work, but the only alternative is to get everyone else to scrub the broken commit C from their repositories, in which case you can "rewrite history" to pretend you used one of the first two options in the first place.

(Fortunately git bisect can deal with broken commits—and if you are going to preserve history, you can use git notes to annotate these commits for automation during bisections. See http://xkcd.com/974/ for details. :-) )
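
The after-the-fact fix and the annotation could be sketched together like this. The repository, file contents, and note wording are illustrative assumptions, and the sketch restores prepare() by taking the merge's first parent's copy of shared.py, which is only safe because nothing else in that file changed.

```shell
#!/bin/sh
# Sketch only: repository contents, messages, and note text are illustrative.
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

printf 'def helper():\n    return 1\n\n\ndef prepare():\n    return 42\n' > shared.py
git add shared.py
git commit -q -m "*: base"
git branch branch1
git branch branch2

git checkout -q branch1
printf 'from shared import prepare\n\nprepare()\n' > work.py
git add work.py
git commit -q -m "A: call prepare() from work.py"

git checkout -q branch2
printf 'def helper():\n    return 1\n' > shared.py
git commit -q -a -m "B: remove unused prepare()"

# Commit C: the bad merge (imagine it has already been pushed).
git checkout -q branch1
git merge -q -m "C: merge branch2" branch2
broken=$(git rev-parse HEAD)

# Commit D: an ordinary commit on top of C that puts prepare() back.
git checkout "${broken}^1" -- shared.py
git commit -q -m "D: restore prepare(), lost in merge C"

# Annotate the broken merge so people (or scripts) can recognize it later.
git notes add -m "Broken merge: prepare() is missing; fixed by $(git rev-parse --short HEAD)" "$broken"
git notes show "$broken"
```

During a later bisection, a commit annotated this way is a candidate for git bisect skip rather than a good or bad verdict. Notes live under their own ref (refs/notes/commits by default), so they have to be pushed separately if others are meant to see them.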