Really flatten a git merge

Question

There are a few questions about "flattening a merge" on StackOverflow, and the usual answer is "git rebase". These answers, though, miss one crucial point: the order of commits.

Suppose there's a branch A with commits from Jun 1 and Aug 1, and a branch B with a commit from Jul 1 (UPDATE to restate the use case described below: the branches are fully independent and have no common ancestry, for example because they come from 2 different repositories). When merging B into A, the history will be as follows (per git log):

Merged branch 'B'
Aug 1
Jul 1
Jun 1

Now, what I'm looking for is a way to get the same result, but without the merge commit (and thus with an underlying linear history in that order, and yes, that means re-parenting commits). git rebase doesn't help here, because with it you get one of the following histories:

Jul 1
Aug 1
Jun 1

or

Aug 1
Jun 1
Jul 1

In other words, git rebase always stacks one branch on top of another, while I'm looking for a solution which will intersperse commits sorted by the author's commit date.

For simple cases, the needed arrangement can apparently be achieved by manually postprocessing the git rebase result with git rebase -i, but that's not practical for large histories, so I'm looking for an automated command or script.

Use case? If A and B represent different parts of the same project which happened to live in different repos, and the time has come to correct that by merging them together, then it's natural to want a linear history unfolding in the actual order of development.

It sounds like you're trying to bolt a linear development path onto a non-linear version control system. It may seem "natural" to want to put all the commits in chronological order, but it would be a false history, as your teams were not, in fact, collaborating with one another at the time. What's of critical importance is the final state, integrating the efforts of the two teams, rather than after the fact incomplete merge steps. — Peter Bratton, Sep 04 '12 at 20:44
@jordan002: The question specifies the fact that the "teams" were "collaborating" on the two branches as the starting condition. As for "critical importance", this question is exactly about what it is, not about opinions on development methodologies. — pfalcon, Sep 05 '12 at 04:03
@pfalcon: Actually, it doesn't say that anywhere in your question. Further, what is the actual problem that you're trying to solve here? We understand what you _want_ to do; but what is the problem that you're trying to solve? — Infiltrator, Sep 05 '12 at 05:31
FWIW, `git rebase` handles the merges reasonable good. I.e. the order of commits is preserved as one would expect. Except when you expect them to be in chronological order, because according to the _non-linear_ history the `rebase` has to deal with the commits are _not_ in that order. — fork0, Sep 05 '12 at 06:48
@Infiltrator: I gave down-to-earth example in a comment to your answer below. Otherwise I indeed tried to formulate question as abstract git one and thus reusable, rather than "spur of the moment" one. — pfalcon, Sep 12 '12 at 17:00
I am currently facing a similar situation and understand why @pfalcon wants this. In my case, the two teams WERE collaborating and, in fact, commits in one repo logically relate to and require commits in the other repo. So, time-order really does make sense. — Chris Cleeland, Jul 11 '13 at 14:04
Just another use case for this question: I have a svn repo with several externals which point to different repos. I want to start over with git as a 'new branch' which contains everything necessary but with less complexity for all new releases from now. After converting svn->git all repos I'll have to filter-branch their directory structure, then merge in chronological order so that the history looks as if there never were externals. This will result in a compilable history at least back to the time when the externals changed the last time. Thanks for the answers. I will have a look at them. — Daniel Alder, Aug 28 '15 at 15:23

score 15 · Accepted Answer · edited May 23 '17 at 12:06

After some thinking, I figured out the answer to "How do I run git rebase --interactive in non-interactive manner?", which also provides a completely scripted solution for this question.

1. Bring the 2 branches from different repositories into one repository (git remote add + git fetch).

2. Rebase (non-interactively) one branch on top of the other (order matters: consider which branch's first commit you'd like to have as the first commit of the consolidated branch).

3. Prepare the following script (rebase-reoder-by-date):

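#!/bin/sh
# Used as GIT_SEQUENCE_EDITOR: $1 is the path to the rebase todo file.
# Insert each commit's author date after its hash, then sort on that date.
# Note: %ai prints the date with its zone offset, so the sort compares
# wall-clock strings; comment lines of the todo file are dropped.
awk '
/^pick/ {
            printf "%s %s ", $1, $2;
            # printf %s (rather than echo -n) prints the date without a newline
            system("printf %s \"$(git show -s --format=%ai " $2 ")\"");
            for (i = 3; i <= NF; i++) printf " %s", $i; printf "\n";
        }
' "$1" | sort -k3 > "$1.tmp"
mv "$1.tmp" "$1"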

4. Run: GIT_SEQUENCE_EDITOR=./rebase-reoder-by-date git rebase -i <initial commit>

Disclaimer: all these operations should happen on copies of the original repositories. Review, validate, and test the combined branch to make sure it is what you expected and contains what you expect, and keep backups handy.
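The four steps above can be sketched end to end on a throwaway repository. This is only an illustration under stated assumptions: the helper commit_on, the editor script name sort-todo, the file names, and the dates are hypothetical and chosen to mirror the example in the question. Both branches are created in one repository instead of being fetched from two remotes. The editor script is also a variation on the one in the answer: it uses a plain shell loop and the numeric author timestamp (%at) as the sort key rather than awk and %ai.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name "Example"
git config user.email "example@example.com"

# commit_on DATE FILE SUBJECT: append to FILE and commit with both dates set to DATE.
commit_on() {
    echo "$1" >> "$2"
    git add "$2"
    GIT_AUTHOR_DATE="$1" GIT_COMMITTER_DATE="$1" git commit -q -m "$3"
}

# Branch A: commits from Jun 1 and Aug 1.
git checkout -q -b A
commit_on "2012-06-01 12:00:00 +0000" a.txt "Jun 1"
commit_on "2012-08-01 12:00:00 +0000" a.txt "Aug 1"

# Branch B: unrelated history with one commit from Jul 1
# (stands in for step 1, git remote add + git fetch).
git checkout -q --orphan B
git rm -rfq .
commit_on "2012-07-01 12:00:00 +0000" b.txt "Jul 1"

# Step 2: stack B on top of A (history is now Jun 1, Aug 1, Jul 1).
git rebase -q A B

# Step 3: sequence editor that sorts the pick lines by author timestamp.
cat > "$work/sort-todo" <<'EOF'
#!/bin/sh
# $1 is the rebase todo file handed over by git.
grep '^pick ' "$1" | while read -r cmd hash rest; do
    printf '%s %s %s %s\n' "$(git show -s --format=%at "$hash")" "$cmd" "$hash" "$rest"
done | sort -s -n -k1,1 | cut -d' ' -f2- > "$1.tmp"
mv "$1.tmp" "$1"
EOF
chmod +x "$work/sort-todo"

# Step 4: replay everything after the initial commit in sorted order.
root=$(git rev-list --max-parents=0 HEAD)
GIT_SEQUENCE_EDITOR="$work/sort-todo" git rebase -q -i "$root"

git log --format='%ad %s' --date=short
```

The two branches touch different files here, so the reordered picks apply without conflicts; real histories that touch the same files may stop the rebase and need manual resolution.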

score 2 · Answer 2 · edited May 23 '17 at 12:06

[See my other answer for a completely automated solution. I'm leaving this as an example of the path which led to the ultimate solution, in case someone faces a similar not-so-obvious task.]

OK, this is not a real answer to the question (a fully scripted, automated solution), but rather thinking and an example of how (interactive rebase based) processing can be automated.

Well, first of all, for the ultimate solution git filter-branch --parent-filter looks like exactly what's needed. Except that my git-fu doesn't allow me to write a 1-, 2-, or 3-liner with it, and the approach of writing a standalone script to parse through all revisions is not cool and takes more effort than rebase -i.

So, rebase -i could be used efficiently if the author dates of commits were visible. My first thought was to temporarily patch commit messages to start with the author date using git filter-branch --msg-filter, run rebase -i, then unpatch the messages.

My second thought, though, was: why bother? Better to patch the commit list used by rebase -i. So, the process would be:

1. Bring branches A and B from different repos into one repo, as usual.
2. Rebase (non-interactively) one branch on the other. Consider which branch should be rebased on which, to get the initial commit right (it cannot be easily rewritten with rebase).
3. Start git rebase -i
4. In another console, go to $REPO/.git/rebase-merge/
5. Run: awk '/^pick/ {printf "%s %s ", $1, $2; system("echo -n `git show --format='%ai' -s " $2 "`"); for (i = 3; i <= NF; i++) printf " %s", $i; printf "\n"; }' git-rebase-todo > git-rebase-todo.new; mv git-rebase-todo.new git-rebase-todo
6. This also seems just the right place and way to reorder the commits: sort -k3 git-rebase-todo >git-rebase-todo.new; mv git-rebase-todo.new git-rebase-todo
7. Switch to the original console and reload the git-rebase-todo file in the editor, then exit the editor.

Voila! Actually, this could be completely scripted if git rebase -i could work in a "non-interactive" mode; I submitted "How do I run git rebase --interactive in non-interactive manner?" for that.
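The reordering step can be tried without a repository, on a hand-written todo file that already carries the date columns added by the awk command. This is a minimal sketch: the hashes, dates, and subjects below are made up for illustration.

```shell
#!/bin/sh
set -e

# A todo file as it would look after the dates have been inserted
# (hypothetical hashes and subjects).
todo=$(mktemp)
cat > "$todo" <<'EOF'
pick 1111111 2012-08-01 12:00:00 +0000 Aug 1
pick 2222222 2012-07-01 12:00:00 +0000 Jul 1
pick 3333333 2012-07-15 09:30:00 +0000 Jul 15
EOF

# Sort on everything from the third field onward, i.e. the date and time.
sort -k3 "$todo" > "$todo.new"
mv "$todo.new" "$todo"

cat "$todo"
```

git reads only the command and the hash from each todo line, so the extra date columns do not need to be stripped before the rebase continues.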

score 0 · Answer 3 · answered Sep 05 '12 at 05:30

What is the problem with leaving separate development in separate lines up until they were merged? If they were separate, then they were separate.

There are many ways to view the history in chronological order without hacking the history the way you're trying to. Have you tried git log --pretty --date-order?
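As a hedged illustration of the view-only approach, the sketch below builds a throwaway repository with the dates from the question, merges the two unrelated branches, and lists the result. It uses git log --author-date-order, a related option that orders by author timestamp, instead of --date-order; the file names, the merge date, and the demo directory are assumptions made for the example, and --allow-unrelated-histories needs a reasonably recent git.

```shell
#!/bin/sh
set -e

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

# Branch A: Jun 1 and Aug 1.
git checkout -q -b A
echo one > a.txt; git add a.txt
GIT_AUTHOR_DATE="2012-06-01 12:00:00 +0000" GIT_COMMITTER_DATE="2012-06-01 12:00:00 +0000" \
    git commit -q -m "Jun 1"
echo two >> a.txt; git add a.txt
GIT_AUTHOR_DATE="2012-08-01 12:00:00 +0000" GIT_COMMITTER_DATE="2012-08-01 12:00:00 +0000" \
    git commit -q -m "Aug 1"

# Branch B: unrelated history, Jul 1.
git checkout -q --orphan B
git rm -rfq .
echo one > b.txt; git add b.txt
GIT_AUTHOR_DATE="2012-07-01 12:00:00 +0000" GIT_COMMITTER_DATE="2012-07-01 12:00:00 +0000" \
    git commit -q -m "Jul 1"

# Merge B into A, keeping the merge commit.
git checkout -q A
GIT_AUTHOR_DATE="2012-09-01 12:00:00 +0000" GIT_COMMITTER_DATE="2012-09-01 12:00:00 +0000" \
    git merge -q --allow-unrelated-histories -m "Merged branch 'B'" B

# View the merged history ordered by author date; nothing is rewritten.
git log --author-date-order --format='%ad %s' --date=short
```

This only changes how the log is displayed; the merge commit and the two separate lines of history are still there, which is exactly what the question wants to avoid.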

– Infiltrator

Ok, if generic description in the question is not enough, here's more concrete example: client and server parts of project were initially created as 2 separate git repos. But their development went in parallel, like feature was added to a server, and related code added to client, etc. So, there were not "separate lines of development", only repos were separated. At later time, it became apparent that both client and server are *one* project, and they were worked on as such, and what's left is to merge them into 1 repo which would represent their *common* line of development. – pfalcon Sep 12 '12 at 16:52
You can replace "server" and "client" above with "main app" and "library", or with "implementation in language A" and "implementation in language B", or with "interface" and "implementation". Clearly, such usecase is more or less generic, and that's how I formulated the question, wanting to find solution reusable by community, not scratching just my momentarily itch. And yes, it's more like intellectual challenge ("git can do a lot, can it do this"). So, yes, I'd like to find solution which would make repo look like it would have been if devel was done "right" from start, not just a workaround – pfalcon Sep 12 '12 at 16:58
For the record, I came across this SO question because I'm trying to merge two Git repositories which were cloned from Subversion. The Subversion/Git conversion process isn't very good at picking individual subdirectories, so we created separate Git repositories. – Huw Walters Apr 28 '14 at 15:46

score 0 · Answer 4 · edited Sep 03 '15 at 17:14

Actually, if I understand correctly, you can achieve this easily with git-stitch-repo.

– weynhamz, answered Oct 22 '13 at 16:09 (edited by Daniel Alder)

Interesting tool. Unfortunately, the result is different branches, not one. The result of this tool is the starting point for this question. – Daniel Alder Sep 03 '15 at 16:11
