Before I get into all the details below, if you're just asking how to find the deleted code (relatively) easily, you can run git diff on the merge base and the two commits. This is the same set of diffs that git merge merged. To find the merge base, use git merge-base:
$ merge=1234567... # or some other way to locate the merge commit
$ git merge-base --all ${merge}^1 ${merge}^2
Ideally, this prints out just one commit ID, which is the merge base. You can then git diff that hash ID against ${merge}^1 (the merge's first parent) and ${merge}^2 (the merge's second parent).
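As a rough sketch of that lookup, the script below builds a throwaway repository and then asks which parent deleted the function. Everything in it is an assumption made for illustration: the temporary directory, the file contents, the commit messages, and the committer identity. Only the branch and file names follow the example discussed in this answer.

```shell
#!/bin/sh
set -eu

# Throwaway repository; contents and identity below are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Base commit: shared.py defines prepare(), and nothing calls it yet.
printf 'def prepare():\n    return "ready"\n' > shared.py
printf 'print("working")\n' > work.py
git add shared.py work.py
git commit -q -m "base"
git branch branch1
git branch branch2

# One side starts calling prepare() from work.py.
git checkout -q branch1
printf 'from shared import prepare\nprint(prepare())\n' > work.py
git commit -q -am "use prepare"

# The other side deletes prepare() as unused.
git checkout -q branch2
printf '# nothing here any more\n' > shared.py
git commit -q -am "delete prepare"

# The merge completes without any conflict.
git checkout -q branch1
git merge -q --no-edit branch2
merge=$(git rev-parse HEAD)

# Diff the merge base against each parent to see which side removed prepare().
base=$(git merge-base --all "${merge}^1" "${merge}^2")
removed_by_first=$(git diff "$base" "${merge}^1" -- shared.py | grep -c '^-def prepare' || true)
removed_by_second=$(git diff "$base" "${merge}^2" -- shared.py | grep -c '^-def prepare' || true)
echo "first parent removed prepare(): $removed_by_first"
echo "second parent removed prepare(): $removed_by_second"
```

In a real repository you would skip the setup and start from the merge= line, using the hash of the actual merge commit.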
As Tim Biegeleisen noted in a comment, your diagram does not quite seem to match your text. Fortunately, from the text we can describe a much simpler situation, where problems arise from separate files' interactions. The effect is the same, but the setup is easier:
       A   <-- branch1
      /
...--*
      \
       B   <-- branch2
Here we have one commit A on branch1 and a different commit B on branch2, both descending from a common base branch whose tip commit is *. (We only really need to care that commit * exists, for the upcoming merge.)
Suppose that in the merge base, there is a function prepare() defined in shared.py that is not actually used anywhere. It was meant for use in branch1, and is now used in branch1 from work.py. But the author of commit B in branch2 decided to delete prepare() precisely because it was never used.
You now wish to combine commits A and B, perhaps as a new commit on branch1:
$ git checkout branch1
$ git merge branch2
Git compares commit * to commit A: this says, among other things, "add call to prepare in file work.py". This is no problem for Git, because commit B does not even touch file work.py.
Git then compares commit * to commit B: this says, among other things, "remove definition of prepare from file shared.py". This is no problem for Git either, because commit A does not touch this part of shared.py (and maybe even does not touch shared.py at all).
The result is a new merge commit:
       A---C   <-- branch1
      /   /
...--*   /
      \ /
       B   <-- branch2
Git is a tool (pun maybe not intended), not a solution
It is your job, as the person doing the merge, to tell whether Git got the merge right. Git will get this merge wrong, in part because whoever wrote commit B did the wrong thing. If you simply run git merge and push the result because the merge succeeded, that's your mistake right there, and you will need to recover from it. We'll get to this in a moment, but first, let's look at ways to avoid or correct the mistake early.
Test thoroughly
This is not perfect but has the advantage of being easy to do by computer. If there are good tests, you can do the git merge and then run the tests. If the tests fail, you know something is wrong with the merge, so stop using it, and go back and fix it.
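A minimal sketch of that merge-then-test loop follows, again in a throwaway repository. The run_tests function is a hypothetical stand-in for a real test suite (it only checks that a name imported from shared.py is still defined there), and the file contents and identity are likewise made up for the example.

```shell
#!/bin/sh
set -eu

# Throwaway repository reproducing the situation; details are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
printf 'def prepare():\n    return "ready"\n' > shared.py
printf 'print("working")\n' > work.py
git add shared.py work.py
git commit -q -m "base"
git branch branch1
git branch branch2
git checkout -q branch1
printf 'from shared import prepare\nprint(prepare())\n' > work.py
git commit -q -am "use prepare"
git checkout -q branch2
printf '# nothing here any more\n' > shared.py
git commit -q -am "delete prepare"

# Hypothetical stand-in for a test suite: if work.py imports prepare,
# shared.py must still define it.
run_tests() {
    if grep -q 'import prepare' work.py; then
        grep -q '^def prepare' shared.py
    fi
}

git checkout -q branch1
before=$(git rev-parse HEAD)
git merge -q --no-edit branch2

if run_tests; then
    outcome=kept
else
    # Nothing has been pushed yet, so the merge can simply be discarded.
    git reset -q --hard HEAD^
    outcome=discarded
fi
echo "merge $outcome"
```

The point of the sketch is the ordering: the test runs after the merge and before anything is pushed, while undoing the merge is still cheap.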
Scan the merge manually
This catches more errors, but also misses more errors: it can catch problems that no test checks for, but it relies on fallible humans.
If you spot a problem this way that does not show up during testing, you may be able to write an automated test to detect it. If that does not take too long, that's a good idea.
Options for fixing the problem before pushing
Since the merge is not used anywhere, you can just un-make it:
$ git reset --hard HEAD^
which discards the merge (commit C) so that branch1 points to A again. We do not need to keep the index or work-tree either, since Git made the merge in a fully automated manner. We now must decide how to fix the problem.
Option 1: restore the missing code on branch2
$ git checkout branch2
... edit shared.py to restore the code ...
$ git add shared.py; git commit
This gives us:
       A      <-- branch1
      /
...--*
      \
       B--C   <-- branch2
where a merge to create D on branch1 will again succeed, but not remove the necessary code. We can re-merge and re-test.
Option 2: do the merge but don't commit it:
$ git merge --no-commit branch2
Now we can fix up shared.py, git add it, and commit. This option is fast, but a little bit dirty, because we now have a merge commit that has a manual fixup from a problem not seen by Git itself. If the problem is found through automated testing, and we consistently use automated testing, and for some reason we have to repeat this merge in the future, we'll re-discover the problem. If we also comment on this in the merge message, that's probably sufficient.
This option does have the advantage that we did not have to add another commit to branch2.
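Option 2 might look like the sketch below, run in a throwaway repository. The repository contents and identity are hypothetical, and the fix-up step takes a shortcut: because the branch1 side never touched shared.py in this example, restoring the whole file from HEAD is enough. In a real merge you would more likely edit the file by hand, since restoring it wholesale would also throw away any other changes branch2 made to it.

```shell
#!/bin/sh
set -eu

# Throwaway repository reproducing the situation; details are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
printf 'def prepare():\n    return "ready"\n' > shared.py
printf 'print("working")\n' > work.py
git add shared.py work.py
git commit -q -m "base"
git branch branch1
git branch branch2
git checkout -q branch1
printf 'from shared import prepare\nprint(prepare())\n' > work.py
git commit -q -am "use prepare"
git checkout -q branch2
printf '# nothing here any more\n' > shared.py
git commit -q -am "delete prepare"

# Merge, but stop before the merge commit is made.
git checkout -q branch1
git merge -q --no-commit branch2

# Fix-up: put back the branch1 version of shared.py (HEAD is still the
# pre-merge commit). This updates both the index and the work-tree.
git checkout HEAD -- shared.py

# Conclude the merge, recording in the message why the result was hand-edited.
git commit -q -m "Merge branch2; keep prepare(), which work.py now calls"
```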
There's a slight variant of Option 1 that is more work than either, but may be the best of all for some cases (those where disturbing branch2 is a problem): make a new branch with the fixed version of B, and merge that, so that we get:
       A---D   <-- branch1
      /   /
...--*   C   <-- fixed-branch2
      \ /
       B   <-- branch2
Option 3: Fixing things after-the-fact
Of course, what we have now is that the merge has been pushed (or published). Retracting published merges is even harder than retracting published regular (non-merge) commits, because reverting a merge sets us up to have a different but related failure on subsequent merges. So we can just leave the "bad" version out there and simply make a fix and push that:
       A---C--D   <-- branch1
      /   /
...--*   /
      \ /
       B   <-- branch2
Commit D, which we make as the new tip of branch1, just puts back the missing function, the same as we would do if we were fixing branch2 directly.
This leaves C behind as a "broken commit". That's basically just the way things are: people have that commit; people are using it; so we're kind of stuck with it. It causes headaches later if/when we go to use git bisect to find a problem, since commit C doesn't really work, but the only alternative is to get everyone else to scrub the broken commit C from their repositories, in which case you can "rewrite history" to pretend you used one of the first two options in the first place.
(Fortunately git bisect can deal with broken commits, and if you are going to preserve history, you can use git notes to annotate these commits for automation during bisections. See http://xkcd.com/974/ for details. :-) )
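One way to let an automated bisection step around a broken commit is the exit status 125, which git bisect run treats as "this commit cannot be tested, skip it". The sketch below is only an illustration: the linear five-commit history, the file contents, and the check script are all made up, and it uses a plain check of the tree rather than git notes to decide when to skip.

```shell
#!/bin/sh
set -eu

# Throwaway repository with a hypothetical linear history.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# snapshot <message>: commit whatever is currently in the work-tree.
snapshot() {
    git add -A
    git commit -q -m "$1"
}

printf 'def prepare():\n    return 1\n' > shared.py
printf 'answer = 42\n' > work.py
snapshot "good start"
good=$(git rev-parse HEAD)
printf '# prepare() lost in a bad merge\n' > shared.py
snapshot "broken: cannot be tested"
printf 'def prepare():\n    return 1\n' > shared.py
snapshot "fix: prepare() restored"
printf 'answer = 41\n' > work.py
snapshot "regression"
culprit=$(git rev-parse HEAD)
printf 'unrelated\n' > notes.txt
snapshot "later work"

# Check script, kept outside the repository so checkouts do not disturb it.
check=$(mktemp)
cat > "$check" <<'EOF'
# Untestable tree: exit 125 so bisect skips this commit.
grep -q '^def prepare' shared.py || exit 125
# The real test: a nonzero status marks the commit as bad.
grep -q '^answer = 42$' work.py
EOF

git bisect start HEAD "$good" >/dev/null
git bisect run sh "$check" >/dev/null
found=$(git rev-parse refs/bisect/bad)
git bisect reset >/dev/null 2>&1
echo "first bad commit: $(git log -1 --format=%s "$found")"
```

Whether bisect actually lands on the broken commit depends on which midpoints it picks; if it does, the 125 status keeps that commit from being blamed for the regression. When bisecting by hand, git bisect skip plays the same role.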