I'm looking for a way to set up git repositories that include subsets of files from a larger repository, and inherit the history from that main repository. My primary motivation is to be able to share subsets of the code via GitHub.
I currently manage my research-related (mostly Matlab) code via a single git repository. The code itself is loosely organized into a handful of folders, with code dependencies that often cross over folders. I don't want to upload a remote copy of the whole repository, because it contains a mix of many projects that no one else would want in their entirety.
My mental picture of this involves a separate repository for each project that tracks only the relevant files for that project, but inherits all the commits from the main repository. Ideally, I'd like to be able to tag versions within these sub-repositories separately from the main one, but that's not a necessity. I've looked into git submodules, subtrees, and gitslave, but all of these seem to assume that the subprojects are isolated collections of files, while in my case many subprojects share files with other subprojects. I also attempted to create a project-specific branch and remove the irrelevant files from it with git rm, but that fell apart as soon as I needed to merge changes from the main branch into the project branch (a mess of conflicts due to changes in files the project branch had deleted).
The stats:
- 8096 files in main repository
- 14 subprojects I want to share
- 394 total files in those subprojects
- 276 files belong to only 1 project, 57 to 2, 60 to 3, and 1 to 6.
I currently share code by simply copying the relevant files to a new folder periodically for each project. But this means that the new copies have no commit history attached. Is there a more robust method of sharing these various subsets of code, and keeping them up to date with changes I make?
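For concreteness, the copy-based workflow described above can be sketched roughly as follows. This is only an illustration under assumptions, not the actual script: the function name `export_project` and the idea of keeping a per-project list of repository-relative paths are hypothetical.

```python
import shutil
from pathlib import Path


def export_project(repo_root, dest_root, relative_paths):
    """Copy the listed files from repo_root into dest_root.

    relative_paths is a hypothetical per-project manifest: paths relative
    to the main repository. Files shared by several projects simply appear
    in several manifests.
    """
    repo_root, dest_root = Path(repo_root), Path(dest_root)
    copied = []
    for rel in relative_paths:
        src = repo_root / rel
        dst = dest_root / rel
        # Recreate the folder layout so cross-folder dependencies still resolve.
        dst.parent.mkdir(parents=True, exist_ok=True)
        # copy2 keeps file contents and timestamps, but no git history.
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

Running something like this once per project reproduces the file layout in a fresh folder, but every run is just a snapshot: nothing ties the copies back to the commits in the main repository.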