
I know this is unusual, but one of my git repos, which acts as a collect-all repo, is getting too big, and I'd like to split it in two: repoA and repoB.

I've found one way to do the splitting in "Forking a sub directory of a repository on GitHub and making it part of my own repo". However, that only covers splitting the files. I want the history to be split as well, so that repoA contains only the history of repoA and not that of repoB, and vice versa. Otherwise I'll end up with two repos that together take double the space, because each keeps all the history.

UPDATE:

Thanks to @ElpieKay for pointing me to git filter-branch (instead of the git clean that I found when searching for "git purge"), I found this:

https://help.github.com/articles/splitting-a-subfolder-out-into-a-new-repository/

which is exactly what I was looking for. However, there is one more question: how do I remove the repoA content from repoB? That is, when splitting out repoA, I only need to run:

git filter-branch --prune-empty --subdirectory-filter sub1/sub2/sub3 master 

So in repoB, how do I remove sub1/sub2/sub3 while keeping everything else?

Moreover, regarding this command from the doc above:

 git remote set-url origin https://github.com/USERNAME/NEW-REPOSITORY-NAME.git

When I tried it, it only updated the fetch URL; the push URL still points to the old one. What am I missing?

Vadim Kotov
xpt

1 Answer


So in repoB, how do I remove sub1/sub2/sub3 while keeping everything else?

You do an index-filter:

cd repoB
git filter-branch --prune-empty --index-filter \
  'git rm -r -f -q --ignore-unmatch --cached sub1/sub2/sub3' --tag-name-filter cat -- --all

That will remove sub1/sub2/sub3 from repoB, and --prune-empty drops any commit that becomes empty as a result.
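To see both halves of the split side by side, here is a minimal sketch on a throwaway repository. Everything in it is made up for illustration (the temp directory, the `big` repo, the files `a.txt` and `other/b.txt`); only the two filter-branch invocations come from this thread. It assumes a reasonably recent Git; `FILTER_BRANCH_SQUELCH_WARNING` only skips the warning pause that newer versions print.

```shell
#!/bin/sh
# Throwaway demonstration; all repo and file names here are hypothetical.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's warning pause
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Build a tiny "collect-all" repo: one commit outside sub3, one inside it.
git init -q big
cd big
git symbolic-ref HEAD refs/heads/master   # make sure the branch is named master
mkdir -p other sub1/sub2/sub3
echo b > other/b.txt
git add other && git commit -q -m "add other/b.txt"
echo a > sub1/sub2/sub3/a.txt
git add sub1 && git commit -q -m "add sub1/sub2/sub3/a.txt"
cd "$work"

# Work on two independent copies so the original stays untouched.
cp -R big repoA
cp -R big repoB

# repoA: keep only sub1/sub2/sub3, which becomes the new repository root.
(cd repoA && git filter-branch --prune-empty \
    --subdirectory-filter sub1/sub2/sub3 master >/dev/null)

# repoB: drop sub1/sub2/sub3 from every commit and prune emptied commits.
(cd repoB && git filter-branch --prune-empty --index-filter \
    'git rm -r -f -q --ignore-unmatch --cached sub1/sub2/sub3' \
    --tag-name-filter cat -- --all >/dev/null)

# Collect what is left in each copy.
a_files=$(git -C repoA ls-files)
a_count=$(git -C repoA rev-list --count master)
b_files=$(git -C repoB ls-files)
b_count=$(git -C repoB rev-list --count master)

echo "repoA tracks: $a_files (commits: $a_count)"
echo "repoB tracks: $b_files (commits: $b_count)"
```

Each copy should end up tracking only its own file, with one commit apiece. Trying the commands on a scratch copy like this before rewriting a real repository is cheap insurance, since filter-branch rewrites history in place.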

You will need a git push --force to update your upstream repo (and remove those folders from repoB there as well).
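On the push URL that stayed on the old address: one possible explanation, and only an assumption since the repository's config isn't shown, is that a separate push URL (`remote.origin.pushurl`) had been set at some point. In that case a plain `git remote set-url` changes only the fetch URL, and the push URL needs `--push`. A small sketch with hypothetical example.com URLs:

```shell
#!/bin/sh
# Sketch only: the URLs and the temp repo are hypothetical.
set -e
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"

# Assumed starting state: origin has a separately configured push URL.
git remote add origin https://example.com/old.git
git remote set-url --push origin https://example.com/old.git

# A plain set-url rewrites remote.origin.url (fetch) only.
git remote set-url origin https://example.com/new.git
push_before=$(git remote get-url --push origin)

# Update the push URL explicitly.
git remote set-url --push origin https://example.com/new.git
fetch_after=$(git remote get-url origin)
push_after=$(git remote get-url --push origin)

git remote -v   # lists the fetch and push URLs for inspection
```

If no separate push URL is configured, the push URL simply follows the fetch URL, so checking `git config --get-all remote.origin.pushurl` first would show whether this applies.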

VonC
  • Thanks @Von. I tried that, then used `git log --stat` to double check. However, there are still log entries that relate to changes in `sub1/sub2/sub3`, though no associated files are listed in the log output. Are there more options we can add to completely remove all git log entries associated with `sub1/sub2/sub3`? Thx. – xpt Jul 27 '17 at 20:13
  • @xpt you can try the same command, but with `--prune-empty`, that is `git filter-branch --prune-empty ...` – VonC Jul 27 '17 at 20:15
  • Yes! `--prune-empty` does the trick! Please update the answer, and I'll accept it next. – xpt Jul 27 '17 at 21:32
  • @xpt I agree. I have included that command in the answer for more visibility. – VonC Aug 04 '17 at 14:20