In this post I saw a method for cloning an entire Git repository while preserving its history. I am now trying to split one repository into two, so that one folder in my current repository becomes its own independent repository. In effect, I would like that folder to be cloned with its history preserved, but so that only the history concerning the files in that folder survives. Is that possible?

bahrep
mstaal
  • First of all, you will not keep the full history (changes unrelated to the folder won't be in the new repository). Take a look at `git filter-branch`. The idea is to make a copy of your *.git* directory somewhere and filter that copy. – 0andriy Jan 13 '20 at 21:34
  • I *know* I’ve seen a way to do this... googling « split a repo » gave good hints, esp. https://help.github.com/en/github/using-git/splitting-a-subfolder-out-into-a-new-repository – D. Ben Knoble Jan 13 '20 at 21:40
  • Does this answer your question? [Detach (move) subdirectory into separate Git repository](https://stackoverflow.com/questions/359424/detach-move-subdirectory-into-separate-git-repository) – phd Jan 13 '20 at 22:57
  • History, in Git, *is* commits. There is no "history of files". There are only commits. You ask about some file, and Git walks through all its commits, to see if in *that* commit, the file is different from the copy of the file in the commit that comes right before it. The resulting list is just the subset of commits in the repository in which the file just changed. In other words, what you see is not *file* history, but rather a selected subset of *commit* history. – torek Jan 13 '20 at 23:56
  • Now, imagine you take each original commit, which is a full snapshot of all files. Throw out all but the one folder and make a new commit from this result. Do this with every original commit. The copied commits, which are "improved" by removing all but the files you care about, are a history. Keep these commits, throw out all the original commits, and you have a new and different history in which only those files exist. – torek Jan 13 '20 at 23:59
  • @torek this is exactly what the BFG Repo Cleaner does when you ask it to remove a file or folder. – jessehouwing Jan 14 '20 at 22:00
  • @jessehouwing: yes, it's also what filter-branch and the newfangled filter-repo do. I find it helps people to have a mental model of this process. – torek Jan 14 '20 at 22:59
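
The point made in the comments above, that "file history" is really a path-limited view of commit history, can be checked with stock Git. The following is a minimal sketch, not anyone's recommended workflow: it builds a throwaway repository in a temporary directory, and the folder names `keep` and `other`, the file names, and the commit identity are all made up for illustration.

```shell
#!/bin/sh
# Sketch: compare full commit history with the subset that touches one folder.
# All names below (keep, other, a.txt, b.txt) are hypothetical.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

mkdir keep other
echo a > keep/a.txt
git add . && git commit -q -m "add keep/a.txt"
echo b > other/b.txt
git add . && git commit -q -m "add other/b.txt"
echo a2 >> keep/a.txt
git add . && git commit -q -m "edit keep/a.txt"

# Every commit reachable from HEAD.
all_commits=$(git rev-list --count HEAD)
# Only the commits in which something under keep/ changed.
keep_commits=$(git rev-list --count HEAD -- keep)

echo "all commits: $all_commits"
echo "commits touching keep/: $keep_commits"
```

In this toy repository the first count is 3 and the second is 2, because the commit that only adds `other/b.txt` is skipped when history is limited to `keep`.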

1 Answer

Clone the repo, then use the BFG repo cleaner to delete all the files in the other directory and all commits that are left empty because of that purge.

Then clone the original repo again and rinse and repeat for the other directory.

You can find the BFG repo cleaner here:

.

 git clone --mirror git://example.com/some-big-repo.git
 # Pass the mirrored repo as the last argument so BFG knows what to rewrite
 java -jar bfg.jar --delete-folders RemoveMe --delete-files RemoveMe some-big-repo.git
 cd some-big-repo.git
 git reflog expire --expire=now --all && git gc --prune=now --aggressive
jessehouwing
  • Is this not mostly for deleting files of a certain size? – mstaal Jan 13 '20 at 21:47
  • It can be used for that. Or to remove the history of accidentally included (copyrighted/licensed) files or files with secrets, credentials etc. So it works really well for this scenario. – jessehouwing Jan 14 '20 at 12:02