I have a Subversion server with a few different projects in the standard layout like so:
    ProjectA/
        trunk/
        branches/
        tags/
    ProjectB/
        trunk/
            FolderOfBinaries/
            SourceFolderA/
            SourceFolderB/
            SourceFolderC/
        branches/
        tags/
            v1.0/
            v1.1/
            v2.0/
    ProjectC/
        trunk/
        branches/
        tags/
ProjectB is going to be migrated to Git, but not with a standard clone. I want to split the project into two Git repositories: one for the folder full of large binaries that change relatively often, and another for everything else. I did a full clone of the repository and it's a few GB, but the binaries folder is probably 90% of that, and running git gc takes a long time. I'd rather have a small, fast repository and then add the binaries folder as a submodule if the developer requires it.
I've found two potential options so far. First, I could use git filter-branch to try to remove the folder of binaries from the history, as shown in the Git Book. Second, I could use svndumpfilter to split the current Subversion repository into two and then git svn clone each separately.
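For reference, this is roughly how I understand the first option would look. It is only a sketch and rests on some assumptions: the function names strip_binaries and extract_binaries are made up for illustration, each one is meant to run on its own throwaway copy of the full clone, the SVN tags have already been turned into real Git tags, and the folder name contains no spaces.

```shell
#!/bin/sh
# Sketch of the filter-branch route. Run each function on a separate copy
# of the full clone, because filter-branch rewrites history in place.
BIN_DIR=FolderOfBinaries                 # binaries folder from the layout above
export FILTER_BRANCH_SQUELCH_WARNING=1   # newer Git versions pause with a warning otherwise

# Hypothetical helper: rewrite all branches and tags so that BIN_DIR
# never existed, leaving the small source-only repository.
strip_binaries() {
    ( cd "$1" &&
      git filter-branch --force --prune-empty \
          --index-filter "git rm -r --cached --ignore-unmatch --quiet $BIN_DIR" \
          --tag-name-filter cat -- --all )
}

# Hypothetical helper: rewrite all branches and tags so that the contents
# of BIN_DIR become the repository root, leaving the binaries-only repository.
extract_binaries() {
    ( cd "$1" &&
      git filter-branch --force --prune-empty \
          --subdirectory-filter "$BIN_DIR" \
          --tag-name-filter cat -- --all )
}
```

I would call these as strip_binaries /path/to/copy1 and extract_binaries /path/to/copy2, where both paths are placeholders for copies of the clone. Until git gc has pruned the old objects, the rewritten copies will not actually get any smaller on disk.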
My question, though, is what will happen to all the history, and particularly the branches and tags? I'd still like to know what the folder of binaries looked like at every tag in the project, even though the binaries may not have changed between two tags. Is that possible?
Edit: The folder of binaries is not full of build artefacts (*.class, *.o, *.dll etc) so I can't just strip it out and make them external. It's full of binaries that are output from a third-party program that need to be versioned (think OpenOffice documents, Photoshop files etc.).