This must be a well understood problem, yet I seem to be unable to find relevant information.
TL;DR: git's mapping between the code of the two branches seems to be broken because of the branches' messy history.
The setup:
I am maintaining an "extension" fork. I.e.:
- I have a `fork` repository, which is a copy of an `origin`. (And I have a local repository, which distinguishes the `fork` and `origin` remotes.)
- The fork contains a custom extension to the project. I commit my changes onto `fork/master`.
- Whenever there is an update of `origin/master`, I merge it into `fork/master`. (I.e., I directly fetch and merge `origin/master` into the local repository and push into `fork/master`.)
- In time, `fork/master` sometimes diverges from master significantly, but later converges back again, so most of the code is still in a "1-1" relation.
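To make the setup concrete, here is a minimal sketch of one update cycle, reproduced in a throwaway directory. The directory names (`upstream`, `fork.git`, `local`), the file names and the commit messages are made up for illustration; only the remote names `origin` and `fork` and the branch `master` correspond to my actual setup.

```shell
#!/bin/sh
# Hypothetical reproduction of the fork workflow in a temporary directory.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
tmp=$(mktemp -d)
cd "$tmp"

# The upstream project (plays the role of "origin").
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master   # force the branch name to be master
echo "core" > core.txt
git add core.txt
git commit -q -m "upstream: initial"
cd ..

# The fork starts out as a copy of upstream.
git clone -q --bare upstream fork.git

# The local repository knows both remotes.
git clone -q upstream local               # creates the remote "origin"
cd local
git remote add fork ../fork.git
git fetch -q fork

# An extension commit, published to fork/master.
echo "extension" > ext.txt
git add ext.txt
git commit -q -m "fork: add extension"
git push -q fork master

# Meanwhile, upstream moves on.
cd ../upstream
echo "core v2" > core.txt
git commit -q -am "upstream: update core"
cd ../local

# The update step: fetch, merge origin/master, push to fork/master.
git fetch -q origin
git merge -q --no-edit origin/master
git push -q fork master

git log --oneline --graph master
```

Every such cycle adds one merge commit to `fork/master`, on top of whatever extension commits were made in between.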
The problem:
This way, "diverged" commits accumulate on `fork/master` (hundreds of them). Lately, I have been experiencing more and more problems with conflict resolution:
- Git merge produces a lot of conflicts whose cause is unclear.
- Git merge often autoresolves conflicts incorrectly. In practice this means that after a merge, I often find "random" pieces of code inserted or deleted at "random" places. These changes are utterly wrong, but were applied without me having to review them.

(The mapping between code sections is totally broken, which is obviously the cause of the problem, yet it also makes it impossible to manually review the automatically applied changes in the merge commit.)
I assume that the cause is the three-way merge, which tries to take into account the entire commit history of the branches since the first diverged commit.
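One way to probe this assumption is to ask git which commit it would use as the common ancestor. As far as I understand, a three-way merge compares only the two branch tips against their merge base, and each merge of upstream moves that base forward. The sketch below checks this in a throwaway repository; the branch `upstream` is a hypothetical stand-in for `origin/master`, and the file names are made up.

```shell
#!/bin/sh
# Hypothetical repository: "upstream" stands in for origin/master.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name demo
git config user.email demo@example.com

echo a > a.txt
git add a.txt
git commit -q -m "upstream 1"
git branch upstream

# An extension commit on master.
echo ext > ext.txt
git add ext.txt
git commit -q -m "fork: extension"

# Upstream moves on.
git checkout -q upstream
echo b > a.txt
git commit -q -am "upstream 2"

# Merge upstream into master, as in my workflow.
git checkout -q master
git merge -q --no-edit upstream

# Which commit would the next merge use as its base?
base=$(git merge-base master upstream)
tip=$(git rev-parse upstream)
echo "merge base:   $base"
echo "upstream tip: $tip"
```

Here the two hashes are identical: after the merge, the base is the upstream commit that was just merged, not the first diverged commit. Running `git merge-base master origin/master` in the real repository should show whether the same holds there.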
Solution attempt:
I tried to "rebase" the fork into a single squashed commit on top of `origin/master` and merge this branch into `fork/master`:
    git checkout -b master_rebased master
    git reset --soft origin/master
    git commit
    git checkout master
    git merge master_rebased
This way `fork/master` now in fact has two alternative histories. I reasoned that git would always automatically compute the "shortest" divergence path (one fully containing the entire changeset) before computing merges or divergence statistics, but no: git still claims that my branch is hundreds of commits ahead of `origin/master`.
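The effect can be reproduced on a small scale. The "ahead" number counts every commit reachable from my branch but not from upstream, so merging a squashed branch adds commits rather than replacing the old ones. In this sketch the branch `upstream` is a hypothetical stand-in for `origin/master`, and the three extension commits stand in for my hundreds.

```shell
#!/bin/sh
# Hypothetical repository: "upstream" stands in for origin/master.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name demo
git config user.email demo@example.com

echo base > core.txt
git add core.txt
git commit -q -m "upstream 1"
git branch upstream

# Three extension commits on master.
for i in 1 2 3; do
  echo "ext $i" >> ext.txt
  git add ext.txt
  git commit -q -m "fork: change $i"
done
before=$(git rev-list --count upstream..master)

# The squash-and-merge attempt from above.
git checkout -q -b master_rebased master
git reset -q --soft upstream
git commit -q -m "squashed extension"
git checkout -q master
git merge -q --no-edit master_rebased

after=$(git rev-list --count upstream..master)
echo "ahead before: $before, ahead after: $after"
```

This prints `ahead before: 3, ahead after: 5`: the three original commits are still reachable through the first parent of the merge, and the squashed commit and the merge commit are counted on top of them.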
Of course, I can simply throw out the old `fork/master` and replace it with the rebased master, but that sounds like a very unclear way of dealing with the problem, which probably means that I am missing something.
Summary:
So the questions basically are:
1) What is the cleanest way to reconcile a diverged branch?
2) What is the proper git workflow for maintaining a long-term extension fork?
3) Alternatively: what am I missing?