This must be a well-understood problem, yet I cannot seem to find any relevant information.

TL;DR: Git's mapping between the code of the two branches seems to be broken because of the branches' messy history.

The setup:

I maintain an "extension" fork, that is:

  • I have a fork repository, which is a copy of the origin repository. (I also have a local repository, in which fork and origin are separate remotes.)
  • The fork contains a custom extension to the project. I commit my changes to fork/master.
  • Whenever origin/master is updated, I merge it into fork/master. (I.e., I fetch origin/master, merge it into my local master, and push the result to fork/master.)
  • Over time, fork/master sometimes diverges significantly from origin/master but later converges again, so most of the code is still in a "1-1" relation.

The problem:

This way, "diverged" commits accumulate on fork/master (hundreds of them). Lately, I have been running into more and more problems with conflict resolution:

  • Git merge produces a lot of conflicts whose cause is unclear.

  • Git merge often autoresolves conflicts incorrectly. In practice this means that after a merge, I often find "random" pieces of code inserted or deleted at "random" places. These changes are utterly wrong, yet they were applied without my being asked to review them.

  • (The mapping between corresponding code sections is completely broken. This is obviously the cause of the problem, and it also makes it impossible to manually review the automatically applied changes in the merge commit.)

I assume the cause is that the three-way merge tries to take into account the entire commit history of both branches since the first diverging commit.
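This assumption can be checked. With Git's default merge strategy, a three-way merge compares the two branch tips against their merge base rather than replaying every intermediate commit. The following is a minimal sketch in a throwaway repository. The branch name `upstream` is a hypothetical stand-in for origin/master, and the file names are invented for illustration. It shows that after a merge, the merge base moves up to the upstream commit that was just merged:

```shell
#!/bin/sh
# Throwaway repository; "upstream" is a hypothetical stand-in for origin/master.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/upstream   # name the initial branch explicitly

echo one > upstream.txt && git add upstream.txt && git commit -q -m "upstream 1"

git checkout -q -b master                   # plays the role of fork/master
echo ext > extension.txt && git add extension.txt && git commit -q -m "fork 1"

git checkout -q upstream
echo two >> upstream.txt && git commit -q -a -m "upstream 2"

git checkout -q master
git merge -q --no-edit upstream             # the periodic "merge origin/master" step

# The common ancestor used by the next merge is the upstream commit just merged.
base=$(git merge-base master upstream)
tip=$(git rev-parse upstream)
echo "merge base: $base"
echo "upstream:   $tip"
```

If the two hashes printed are identical, the next merge only has to reconcile what changed on either side since that commit, regardless of how many fork commits have accumulated before it.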

Solution attempt:

I tried to "rebase" the fork into a single squashed commit on top of origin/master and merge this branch into fork/master:

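git checkout -b master_rebased master   # new branch at the current fork tip
git reset --soft origin/master          # move it to origin/master, keeping the fork's tree staged
git commit                              # record the whole extension as one squashed commit
git checkout master
git merge master_rebased                # merge the squashed branch back into master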

This way, fork/master now in fact has two alternative histories. I expected Git to automatically compute the "shortest" divergence path (one that fully contains the entire changeset) before computing merges or divergence statistics, but no: Git still claims that my branch is hundreds of commits ahead of origin/master.

Of course, I can simply throw away the old fork/master and replace it with the rebased master, but that seems like a rather unclean way of dealing with the problem, which probably means that I am missing something.
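The "ahead" figure is a count of every commit reachable from the branch but not from origin/master, so merging in an alternative history can only add to it. Here is a minimal sketch of the squash-and-merge attempt in a throwaway repository. The branch `upstream` is a hypothetical stand-in for origin/master, and three fork commits stand in for the hundreds in the real fork:

```shell
#!/bin/sh
# Throwaway repository; "upstream" is a hypothetical stand-in for origin/master.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/upstream

echo one > upstream.txt && git add upstream.txt && git commit -q -m "upstream 1"

git checkout -q -b master                   # plays the role of fork/master
for i in 1 2 3; do
  echo "ext $i" >> extension.txt
  git add extension.txt && git commit -q -m "fork $i"
done

before=$(git rev-list --count upstream..master)

# The squash-and-merge attempt described above.
git checkout -q -b master_rebased master
git reset -q --soft upstream
git commit -q -m "squashed extension"
git checkout -q master
git merge -q --no-edit master_rebased

after=$(git rev-list --count upstream..master)
squashed=$(git rev-list --count upstream..master_rebased)

echo "ahead before: $before"                # 3 fork commits
echo "ahead after:  $after"                 # 3 fork commits + 1 squashed + 1 merge = 5
echo "squashed branch alone: $squashed"     # 1
```

Only `master_rebased` on its own has the short history. The merged `master` still reaches all the old commits through its first parent, which is why the count grows instead of shrinking.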

Summary:

So the questions basically are:

1) What is the cleanest way to reconcile a diverged branch?

2) What is the proper git workflow to maintain a long term extension fork?

3) Alternatively: what am I missing?

waldir

1 Answer

  3) Alternatively: what am I missing?

Well, I was obviously missing the proper search keywords: "long-lived topic branch" or "long-lived feature branch".

  2) What is the proper git workflow to maintain a long term extension fork?

Searching with these keywords turns up a number of approaches, none of which reveals anything fundamentally new. The recommended ways are:

  • Simply use merges. This obviously does not solve my problem, but it is the easiest way as long as no conflict resolution problems pop up.

  • Use rebases, rewriting/discarding commit history from time to time.

    This answer https://stackoverflow.com/a/7752610/6922501 shows how to deal with the problem of rewriting someone else's history: basically, have the affected party rebase their work onto the rebased branch.

  • Use git rerere, which basically records resolutions of merge conflicts and replays them automatically whenever the same conflict appears again.
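To illustrate the rerere option, here is a minimal sketch in a throwaway repository. The branch `upstream` is a hypothetical stand-in for origin/master, and the file name and contents are invented. A conflict is resolved by hand once, the merge is thrown away, and the same merge is attempted again:

```shell
#!/bin/sh
# Throwaway repository; "upstream" is a hypothetical stand-in for origin/master.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/upstream
git config rerere.enabled true              # record and replay conflict resolutions

echo base > file.txt && git add file.txt && git commit -q -m "base"

git checkout -q -b master                   # plays the role of fork/master
echo "fork change" > file.txt && git commit -q -a -m "fork edit"

git checkout -q upstream
echo "upstream change" > file.txt && git commit -q -a -m "upstream edit"

git checkout -q master
git merge -q --no-edit upstream >/dev/null 2>&1 || true   # conflict; rerere notes it
echo "resolved by hand" > file.txt
git add file.txt && git commit -q -m "merge upstream"      # rerere records the resolution

git reset -q --hard HEAD~1                                 # throw the merge away
git merge -q --no-edit upstream >/dev/null 2>&1 || true   # the same conflict again

replayed=$(cat file.txt)
echo "$replayed"                            # the earlier resolution, reapplied
```

In this sketch the second merge still stops and leaves the file unstaged, so the replayed resolution can be reviewed before `git add` and `git commit`. Setting `rerere.autoupdate` to true would stage it automatically.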

  1) What is the cleanest way to reconcile a diverged branch?

I am still unclear on this point, but it seems that in the end one has to pick either rebase or merge + rerere, with all the consequences (good and bad).
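For comparison, here is a minimal sketch of the rebase alternative in a throwaway repository. The branch `upstream` is a hypothetical stand-in for origin/master, and two fork commits stand in for the real extension. After the rebase, the fork's commits sit directly on top of the upstream tip:

```shell
#!/bin/sh
# Throwaway repository; "upstream" is a hypothetical stand-in for origin/master.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/upstream

echo one > upstream.txt && git add upstream.txt && git commit -q -m "upstream 1"

git checkout -q -b master                   # plays the role of fork/master
for i in 1 2; do
  echo "ext $i" >> extension.txt
  git add extension.txt && git commit -q -m "fork $i"
done

git checkout -q upstream
echo two >> upstream.txt && git commit -q -a -m "upstream 2"

git checkout -q master
git rebase -q upstream                      # replay the fork commits onto the new upstream tip

ahead=$(git rev-list --count upstream..master)
behind=$(git rev-list --count master..upstream)
echo "ahead:  $ahead"                       # 2: only the fork's own commits
echo "behind: $behind"                      # 0: nothing left to merge
```

The price is rewritten history. In the real setup the rebased branch would have to be force-pushed to the fork (for example with `git push --force-with-lease`), and anyone who based work on the old fork/master would need to rebase it as well.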

waldir