
I have an existing project that uses a single repository. The directory structure looks like:

* MyProject
  * client
  * server
  * tester
  * documentation
  * deployment
  * graphics

I would like to modify it so that client, server, and tester are individual repositories; and MyProject still exists with the other subdirectories in it, and it hosts client etc. via git subtree.

So the end result will be that I have the same directory structure, but I can perform version control on client etc. individually without disturbing the rest of the repo.

What commands should I use to achieve this? I have access to log into the remote repo's server and issue git commands directly on the repo.


Bonus extra: Currently I have got all of the commits for client together in the history of MyProject; if possible I would like to keep this history for the new subtree project.

M.M

2 Answers


I was just reading a very interesting page about subtree: http://blogs.atlassian.com/2013/05/alternatives-to-git-submodule-git-subtree/

So, without remote tracking you can do the following:

1) He says to first create the subtree, and provides this snippet. To retain the history on this directory you simply avoid "--squash", although I think this history remains stored in the main repo.

git subtree add --prefix .vim/bundle/tpope-vim-surround https://bitbucket.org/vim-plugins-mirror/vim-surround.git master

2) then just fetch the remote repo

I suspect you'll want to follow his directions for including remote tracking, which is quite different.

ThatsAMorais
  • I have read that page several times and did not understand what it was saying; your answer does not seem to add anything to that page's text either – M.M May 08 '15 at 05:48
  • His example (if I understand it correctly) combines two different existing repos to have one subtree host the other; however I currently have a single repo which I want to change to be different repos with subtree hosting. – M.M May 08 '15 at 05:50
  • Fair enough, I should have commented with a clarifying question, perhaps. Glad you figured it out. – ThatsAMorais May 18 '15 at 17:07

Introduction

This answer is based on this blog post that I did not discover until today.

The steps required are:

  1. If client, server, and tester were developed on separate branches, flatten them (rebase or merge) so that a single commit holds the latest version of all three
  2. Split each of those directories into its own repository (without the directory name in the file paths!)
  3. Remove the files that have been split from the base project
  4. (Optional) remove history from base project
  5. (Optional) Re-add the new repositories as subtrees, submodules, or nested repos.

Splitting the directories

This can be done easily by using git subtree. Note: git-subtree might require installation.

To copy a directory to its own repository, where the directory's name is removed from the path of the files in the new repository (i.e. tester/foo/bar.c becomes foo/bar.c in the new repo tester), use the following code. This is the code for tester; to do multiple splits at once I used a shell for loop, although of course copy-pasting is also possible.

See also this thread, although in that thread the OP wants to move tester/foo/bar.c to ABC/foo/bar.c.

On the git host, where you keep your repos (I set it up so that Base, client, server, and tester were sibling directories, but it doesn't really matter):

cd /path/to/base
mkdir tester.git
cd tester.git
git init --bare

On the development system (use git remote -v to see your existing remote; I use ssh):

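git remote add sub_tester ssh://git_host/path/to/base/tester.git
git checkout master    # assume master for this demonstration
# Build a branch holding only tester's history, with tester/ as the root
git subtree split --prefix=tester -b split_tester
# Publish that branch as master of the new repo, then drop the local branch
git push sub_tester split_tester:master
git branch -D split_tester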

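The subtree split command walks the history of master, picks out the commits that touched files in that directory, and rewrites them into a new, separate history (with its own root commit) in which the directory's contents sit at the top level. The -b option stores the result on a new branch in the same repo. Nothing is merged back into master unless you also pass --rejoin.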
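The for loop mentioned earlier can be sketched roughly as follows. This is only an illustration run against a throwaway repository: the temporary directory, the demo identity, and the file contents are made up, and it assumes git-subtree is installed. In the real project the push to each sub_* remote would go where the comment indicates.

```shell
#!/bin/sh
# Sketch: split client, server and tester out of a throwaway repo in one loop.
set -e
# Hypothetical identity so the commits work on a machine with no git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

demo=$(mktemp -d)
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use the branch name from this answer

# Recreate a miniature version of the MyProject layout.
for dir in client server tester; do
    mkdir -p "$dir/foo"
    echo "int $dir;" > "$dir/foo/bar.c"
done
git add -A
git commit -q -m "Initial layout"

for dir in client server tester; do
    git subtree split --prefix="$dir" -b "split_$dir" >/dev/null
    # In the real project you would now publish and clean up, e.g.:
    #   git push "sub_$dir" "split_$dir:master"
    #   git branch -D "split_$dir"
done

# Paths on a split branch no longer carry the directory name.
git ls-tree -r --name-only split_tester
```

The final ls-tree lists the files on split_tester relative to the old tester directory, which is an easy way to check the prefix was dropped before pushing anything.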


Removing the files that have been split

Now you can clear out the chaff:

git rm -r tester    # deletes the directory and stages the removal
git commit -m "Removing tester which has been split to its own repo"

Removing the history

The new repos have their history in them, so it is not necessary to keep that history around in the base repo. In fact if you later re-add the new repos as subtrees then you've duplicated all that history. I removed the history with the following steps:

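git checkout last_commit_to_keep
git checkout --orphan history    # new parentless branch; the index still holds that commit's tree
git commit -m "Squashed old history"
git replace last_commit_to_keep history
git checkout master              # git refuses to delete the branch that is checked out
git branch -D history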
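To see what the replace trick does, here is a small sketch on a throwaway repository with three commits, where the middle one plays the role of last_commit_to_keep. The repository, the commit messages, and the demo identity are invented for the illustration. As far as I understand it, the old commits are only hidden, not deleted: the replacement lives under refs/replace/, which is not transferred by a default push or clone.

```shell
#!/bin/sh
# Sketch: hide everything before commit "two" in a throwaway repo.
set -e
# Hypothetical identity so the commits work on a machine with no git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use the branch name from this answer

for n in one two three; do
    echo "$n" > file.txt
    git add file.txt
    git commit -q -m "$n"
done
last_commit_to_keep=$(git rev-parse master~1)   # the commit "two"

git checkout -q "$last_commit_to_keep"
git checkout -q --orphan history
git commit -q -m "Squashed old history"
git replace "$last_commit_to_keep" history
git checkout -q master
git branch -q -D history

# With the replacement active, the walk stops at the squashed commit.
visible=$(git rev-list --count master)
# Ignoring replacements shows the original commits are still stored.
stored=$(git --no-replace-objects rev-list --count master)
echo "visible commits: $visible, commits still stored: $stored"
```

Comparing the two counts makes it clear that git replace changes what history-walking commands show rather than what the object database contains.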

Re-adding as subtree

Now you can add the separate repos as subtrees:

git subtree add --prefix=tester sub_tester master
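Once the subtree is in place, changes made under tester can be sent back to the separate repository with git subtree push (and fetched with git subtree pull using the same arguments). The sketch below rehearses that round trip entirely inside a temporary directory, with a local bare repository standing in for tester.git on the git host. All paths, file contents, and the demo identity are assumptions made for the illustration, and git-subtree must be installed.

```shell
#!/bin/sh
# Sketch: add a repo as a subtree, change it, and push the change back.
set -e
# Hypothetical identity so the commits work on a machine with no git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
git init -q --bare "$work/tester.git"     # stands in for the repo on the git host

# Seed the stand-alone tester repo with one commit (in place of the split history).
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/master
echo 'int main(void) { return 0; }' > bar.c
git add bar.c
git commit -q -m "tester: initial"
git push -q "$work/tester.git" master

# A base project that hosts tester as a subtree.
git init -q "$work/base"
cd "$work/base"
git symbolic-ref HEAD refs/heads/master
echo docs > README
git add README
git commit -q -m "base: initial"
git remote add sub_tester "$work/tester.git"
git subtree add --prefix=tester sub_tester master

# Edit a file inside the subtree and commit it in the base project.
echo '/* tweak */' >> tester/bar.c
git commit -q -am "tester: tweak"

# Send the subtree's new commits back to the separate repo.
git subtree push --prefix=tester sub_tester master

git --git-dir="$work/tester.git" log --format=%s master
```

The closing log is run against the stand-in tester.git, so it shows whether the commit made inside the base project actually arrived in the separate repository.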

Addendum (not really related): After doing this I decided that subtrees weren't such a good idea for my workflow after all. The big problem is that if you make a change on a branch in a subtree, then you have to merge that branch into the main part of your repo every time. This quickly leads to a history that looks like spaghetti when viewed in gitk.

Alternatively if you develop on master and then backport to the subtrees then there are no checks preventing you getting out of sync with the subtrees, and you'll probably have to keep merging anyway to make sure you stay in sync.

I ended up splitting as described above but then using submodules which turned out to suit my workflow better. The submodule update commits can be kept tidy by squashing them.

Of course, using nested repositories is another option, but then there is no relative version control at all between the different repos, and it's possible to accidentally delete them with git clean. Probably what would suit me best is one of those power scripts that maintains common tags between nested repos.

M.M