Git merge two overlapping repositories

Question

I have this configuration:

RepoA
  |-Dir1

RepoB

The problem is that RepoB is a modified copy of Dir1. Is there a way to merge RepoB into RepoA without losing the history of RepoB?

Is `Dir1` a git repository by itself? If not, how can `RepoB` be a copy of it? Did you copy `Dir1` and then `git init`ed it? — Shahbaz, Jul 31 '12 at 13:35
@Shahbaz Dir1 is not a git repository by itself. Yeah, it wasn't my brightest moment. — liborw, Jul 31 '12 at 13:47
haha, well I don't know what to do now, but maybe you could take a look at "sub-modules" or something. I don't know them, but if you can turn Dir1 into a sub-module, then add `../../RepoB` as a remote to it, then fetch/merge, you'll get Dir1 updated. Then, you could try to see if it is possible to get Dir1 back as a directory rather than a sub-module. Note: test on a test repository first!! — Shahbaz, Jul 31 '12 at 13:54
Are you looking for something like [graft](http://stackoverflow.com/questions/161928/what-are-git-info-grafts-for)? — Roman, Jul 31 '12 at 14:08
possible duplicate of [Combining multiple git repositories](http://stackoverflow.com/questions/277029/combining-multiple-git-repositories) — Roman, Jul 31 '12 at 14:11

score 0 · Answer 1 · answered Dec 20 '14 at 06:35

There definitely is a way to rescue the history in RepoB, which, if I understand correctly, represents a version of Dir1 under RepoA.

First, I would begin by rewriting the history of the development branch in RepoB so that all the files are renamed according to the pattern * -> Dir1/*. That is to say, I would go to the earliest commit, perform a git mv of all the files into the subdirectory, and then replay every later commit on top of it. This is doable with git rebase --interactive. I would do this on a new branch whose name does not already exist in RepoA. Let's call that the rescue branch.
Next, I would point both RepoA and RepoB at a common remote. If RepoA has an upstream, then RepoB's .git/config can be edited (or git remote add used) to point to that same upstream.
Then we push the rescue branch in RepoB into the upstream, and do a git fetch in RepoA to pick that up. RepoA now has the materials.
At this point, in RepoA, we can cherry pick the rescue branch commits into the development branch.
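
The cherry-pick step can be sketched in isolation as follows. This is only an illustration under assumptions: the scratch directory, the demo identity, and the use of an orphan branch to stand in for a fetched rescue branch are all made up for the example. Only the commits after the rewritten root are picked, because the root itself duplicates what master already contains.

```shell
#!/bin/sh
# Sketch: cherry-pick the commits of an unrelated "rescue" history onto master.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

pick=$(mktemp -d)                        # hypothetical stand-in for repo-a
cd "$pick"
git init -q
git symbolic-ref HEAD refs/heads/master  # fixed branch name for the demo

# master: the original history containing dir1/file
mkdir dir1
printf 'abc\ndef\n' > dir1/file
git add dir1/file
git commit -q -m "file added"

# rescue: an unrelated history with the same starting content plus one change
git checkout -q --orphan rescue
git commit -q -m "repob"
printf 'ghi\n' >> dir1/file
git commit -q -a -m "ghi"

# Back on master, pick everything after the rescue root commit
git checkout -q master
git cherry-pick rescue~1..rescue >/dev/null

git log --format=%s master
```

With a longer rescue history, the range would start at the rewritten root commit instead of rescue~1.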

Concrete scenario, from scratch. In this session, we create a repo-a which is cloned from upstream. In repo-a we create a dir1/file. Then we create a new git repo repo-b as a copy of dir1. We do some hacking on file. Then we rewrite the history so that the file is moved to dir1/file and the changes are replayed, and we do that in a branch called rescue. We establish upstream as the remote of repo-b and push rescue there. We then fetch rescue into repo-a, rebase rescue on top of master, and then install rescue as master.

Note: The history rewriting in rescue was hacked using some manual commands. I first hard reset to the base commit, did a git mv to move the file to the subdirectory, and amended that root commit. Then I cherry picked the next commit after that. In reality, I would do a git rebase --interactive --root and mark the root commit for editing (a plain git rebase --interactive <root-commit-sha> only lists the commits after the root), but that is difficult to show as a logged session.
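
For a history with more than a couple of commits, the same rewrite can be scripted rather than done by hand. The following is a minimal sketch using git filter-branch --tree-filter, which is a different technique from the reset/amend/cherry-pick used in the session below. The scratch directory and demo identity are assumptions, and the simple `for f in *` loop does not move dotfiles.

```shell
#!/bin/sh
# Sketch: rewrite every commit on "rescue" so that top-level files live under dir1/.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's deprecation notice
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

demo=$(mktemp -d)                        # hypothetical stand-in for repo-b
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/master  # fixed branch name for the demo
printf 'abc\ndef\n' > file
git add file
git commit -q -m "repob"
printf 'ghi\n' >> file
git commit -q -a -m "ghi"

# Leave master untouched; rewrite only the rescue branch
git checkout -q -b rescue
git filter-branch --tree-filter '
    mkdir -p dir1
    for f in *; do
        [ "$f" = dir1 ] || mv "$f" dir1/
    done
' HEAD >/dev/null

# Each commit should now list dir1/file instead of file
git log --format=%s --name-only rescue
```

The filter runs once per commit in a temporary checkout, so it is slow on large histories; it is shown here only because it ships with Git itself.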

~$ mkdir upstream
~$ mkdir repo-a
~$ cd upstream
~/upstream$ git init --bare
Initialized empty Git repository in /home/kaz/upstream/
~/upstream$ cd ../repo-a
~/repo-a$ git clone ../upstream .
Cloning into '.'...
done.
warning: You appear to have cloned an empty repository.
~/repo-a$ mkdir dir1
~/repo-a$ cd dir1
~/repo-a/dir1$ cat > file
abc
def
~/repo-a/dir1$ git add file
~/repo-a/dir1$ git commit -m "file added"
[master (root-commit) 1b5cdf9] file added
 1 file changed, 2 insertions(+)
 create mode 100644 dir1/file
~/repo-a/dir1$ git push origin master
Counting objects: 4, done.
Writing objects: 100% (4/4), 252 bytes, done.
Total 4 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (4/4), done.
To /home/kaz/repo-a/../upstream
 * [new branch]      master -> master

Okay, now here comes the silly part: someone makes a copy of repo-a/dir1, calling it repo-b, and initializes that as a Git repo, adding file under version control and committing:

~/repo-a/dir1$ cd ../..
~$ cp -a repo-a/dir1 repo-b
~$ cd repo-b
~/repo-b$ git init
Initialized empty Git repository in /home/kaz/repo-b/.git/
~/repo-b$ git add file
~/repo-b$ git commit -m "repob"
[master (root-commit) fe0c898] repob
 1 file changed, 2 insertions(+)
 create mode 100644 file

Now, that same person starts working on file, adding material to it:

~/repo-b$ cat file
abc
def
~/repo-b$ cat >> file
ghi
~/repo-b$ git diff
diff --git a/file b/file
index 5f5521f..8edb37e 100644
--- a/file
+++ b/file
@@ -1,2 +1,3 @@
 abc
 def
+ghi
~/repo-b$ git commit -a -m "ghi"
[master 3e35a62] ghi
 1 file changed, 1 insertion(+)

Now the person realizes, "Oh no, what am I doing! I want this change over in repo-a, under dir1!".

First step is to connect repo-b to upstream:

~/repo-b$ git remote add origin ../upstream

Now, stop working on master, and change over to a rescue branch which starts out identical to master:

~/repo-b$ git checkout -b rescue
Switched to a new branch 'rescue'
~/repo-b$ git log
commit 3e35a6216f8788cc7a58f7bb84a2dfaf8e47e720
Author: Kaz <kaz@stackoverflow.help.com>
Date:   Fri Dec 19 22:17:54 2014 -0800

    ghi

commit fe0c898b11124d0174e65b32fdcc956443446dcf
Author: Kaz <kaz@stackoverflow.help.com>
Date:   Fri Dec 19 22:17:25 2014 -0800

    repob

Now, rewrite the history of rescue so that the file is under dir1. First, reset to the root commit, and do the rename:

~/repo-b$ git reset --hard fe0c898b11124d0174e65b32fdcc956443446dcf
HEAD is now at fe0c898 repob
~/repo-b$ mkdir dir1
~/repo-b$ git mv file dir1
~/repo-b$ git commit -a --amend -m "repob"
[rescue e3a6c19] repob
 1 file changed, 2 insertions(+)
 create mode 100644 dir1/file

Now, we cherry pick the change on top. The change nicely patches over dir1/file, following the rename:

~/repo-b$ git cherry-pick 3e35a6216f8788cc7a58f7bb84a2dfaf8e47e720
[rescue 7f21bf5] ghi
 1 file changed, 1 insertion(+)
~/repo-b$ ls
dir1
~/repo-b$ cat dir1/file 
abc
def
ghi

Okay, now we push the rescue branch to ../upstream:

~/repo-b$ git push origin rescue
Counting objects: 8, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (8/8), 507 bytes, done.
Total 8 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (8/8), done.
To ../upstream
 * [new branch]      rescue -> rescue

Off to repo-a once again. There we do a git fetch to bring in the rescue branch and its objects:

~/repo-b$ cd ..
~$ cd repo-a
~/repo-a$ git fetch
remote: Counting objects: 8, done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 8 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (8/8), done.
From /home/kaz/repo-a/../upstream
 * [new branch]      rescue     -> origin/rescue

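From here on in, it is Git 101 material. We rebase the rescue branch on top of master. Only "ghi" gets replayed: the rewritten root commit "repob" introduces exactly the same change as master's "file added", so rebase skips it as already applied.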

~/repo-a$ git checkout rescue
Branch rescue set up to track remote branch rescue from origin.
Switched to a new branch 'rescue'
~/repo-a$ git rebase master
First, rewinding head to replay your work on top of it...
Applying: ghi
~/repo-a$ ls
dir1
~/repo-a$ cat dir1/file 
abc
def
ghi
~/repo-a$ # YAY!

Then we switch back to master and move it to rescue. Because rescue now sits directly on top of master, the hard reset has the same effect as a fast-forward:

~/repo-a$ git checkout master
Switched to branch 'master'
~/repo-a$ git reset --hard rescue
HEAD is now at 2546207 ghi
~/repo-a$ cat dir1/file 
abc
def
ghi
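
A slightly safer way to perform that last step is git merge --ff-only, which refuses to move master unless rescue is a direct descendant, whereas git reset --hard moves the branch unconditionally. A minimal sketch, assuming a scratch repository and a demo identity that are not part of the session above:

```shell
#!/bin/sh
# Sketch: advance master to rescue with a guarded fast-forward.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)                        # hypothetical stand-in for repo-a
cd "$work"
git init -q
git symbolic-ref HEAD refs/heads/master  # fixed branch name for the demo
mkdir dir1
printf 'abc\ndef\n' > dir1/file
git add dir1/file
git commit -q -m "file added"

# rescue is one commit ahead of master, as after the rebase
git checkout -q -b rescue
printf 'ghi\n' >> dir1/file
git commit -q -a -m "ghi"

# Fails instead of discarding work if master had diverged
git checkout -q master
git merge -q --ff-only rescue

cat dir1/file
```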

Done.
