Adding to what jdsumsion said, subtree-merge (or git subtree, which does the same thing in one step) won't work, since all it gives you is a merge commit that moves every file from the root into your sub-directory. For the file history to be maintained, each file would need to have always been at its final location, which requires rewriting all previous commits.
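To make the point above concrete, here is a small sketch that builds two throwaway repositories and performs a classic subtree merge by hand. Everything in it (the temporary directory, the `g` helper, the demo identity, the file names) is made up for illustration and does not come from the original procedure. It then counts how many commits git associates with the file before and after the merge.

```shell
set -eu
# Throwaway sandbox; every name below is hypothetical.
WORK="$(mktemp -d)"
# Helper so the demo does not depend on a configured git identity.
g() { git -c user.name="Demo" -c user.email="demo@email.example" "$@"; }

# A tiny "external" project: two commits touching file.txt at its root.
g init -q "$WORK/sub"
cd "$WORK/sub"
echo one > file.txt
g add file.txt
g commit -q -m "sub: first"
echo two >> file.txt
g commit -q -am "sub: second"
SUB_BRANCH="$(git symbolic-ref --short HEAD)"
ORIGINAL_COMMITS="$(git rev-list --count HEAD -- file.txt)"

# The main project, with one unrelated commit.
g init -q "$WORK/main"
cd "$WORK/main"
echo main > main.txt
g add main.txt
g commit -q -m "main: first"

# Classic subtree merge: graft the external tree under MySubproject/.
g remote add sub-remote "$WORK/sub"
g fetch -q sub-remote
g merge -q -s ours --no-commit --allow-unrelated-histories "sub-remote/$SUB_BRANCH"
g read-tree --prefix=MySubproject/ -u "sub-remote/$SUB_BRANCH"
g commit -q -m "Merge sub as MySubproject/"

# Count the commits git associates with the file at its new path.
MERGED_COMMITS="$(git rev-list --count HEAD -- MySubproject/file.txt)"

echo "commits touching file.txt in the external repo:      $ORIGINAL_COMMITS"
echo "commits touching MySubproject/file.txt after merge:  $MERGED_COMMITS"
```

In this sandbox the first count should be 2 and the second 1: only the merge commit is attributed to the new path, because the earlier commits recorded the file at the repository root.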
The way to do that is not git filter-branch, because that is one little shell script that very much does not want you to use it (its own documentation warns against it). You should use git-filter-repo instead.
The procedure just involves fetching the external project as its own remote, as with the subtree merge, then making a local tracking branch and rewriting all commits on that branch so that they have retroactively always used the path you want. You can then just merge the branch into your main project with the --allow-unrelated-histories flag.
The use of bash variables is mainly for ease of reuse and readability. I don't expect this to work if you want your sub-directory to contain spaces and the like, but it should be fairly easy to adjust by hand in that sort of case.
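As one possible way to adjust for a directory name with spaces, the sketch below keeps the directory name as it is and derives a separate name for the remote and the branch, since git ref names cannot contain spaces. The variable SUBTREE_REF and the replacement rule are assumptions made for this example, not part of the original commands.

```shell
# Hypothetical example: a sub-directory name containing a space.
SUBTREE_PREFIX="My Subproject"

# Replace every character outside [A-Za-z0-9._-] with a dash so the result
# can be used in remote and branch names. SUBTREE_REF is an illustrative name.
SUBTREE_REF="$(printf '%s' "$SUBTREE_PREFIX" | tr -c 'A-Za-z0-9._-' '-')"

echo "directory: ${SUBTREE_PREFIX}"
echo "remote:    ${SUBTREE_REF}-remote"
echo "branch:    ${SUBTREE_REF}-master"
```

You would then use "${SUBTREE_REF:?}" wherever the commands name a remote or a branch, and keep "${SUBTREE_PREFIX:?}" only as the target directory. This simple rule does not catch every invalid ref name (a leading dot, for instance), so git check-ref-format is worth running on the result if the name is unusual.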
export SUBTREE_PREFIX="MySubproject"
git remote add -f "${SUBTREE_PREFIX:?}-remote" https://my-git-repo.invalid/Subproject.git
# Adjust "master" if the external project's default branch has another name.
git checkout -b "${SUBTREE_PREFIX:?}-master" "${SUBTREE_PREFIX:?}-remote/master"
# --force is to skip the "freshly cloned repo" check.
# All the refs we'll be operating on are fresh, even if the repo isn't.
# With --dry-run nothing is rewritten: check .git/filter-repo/fast-export.filtered
# to be sure that everything is correct, then rerun without --dry-run before merging.
git filter-repo --refs "${SUBTREE_PREFIX:?}-master" --to-subdirectory-filter "${SUBTREE_PREFIX:?}" --force --dry-run
git checkout master
git merge "${SUBTREE_PREFIX:?}-master" --allow-unrelated-histories
# Repeat for however many repos you need to add
Speaking for myself, given that the entire point of the manipulation is to group the commit history of multiple repositories into one, I would also want to prefix each commit message with the name of the subproject it came from, so I can keep track afterwards.
git filter-repo --refs "${SUBTREE_PREFIX:?}-master" --to-subdirectory-filter "${SUBTREE_PREFIX:?}" --message-callback="return message if message.startswith(b'${SUBTREE_PREFIX:?}:') else b'${SUBTREE_PREFIX:?}: ' + message" --force --dry-run
Also, some git servers will reject your branch if you attempt to push commits that were not committed by you. git rebase normally sets the committer to you while leaving the commit author intact, but here you need to do it manually.
git filter-repo --refs "${SUBTREE_PREFIX:?}-master" --to-subdirectory-filter "${SUBTREE_PREFIX:?}" --commit-callback '
# filter-repo callbacks work on bytes, hence the b"..." literals
commit.committer_name = b"You"
commit.committer_email = b"your@email.example"
' --message-callback="return message if message.startswith(b'${SUBTREE_PREFIX:?}:') else b'${SUBTREE_PREFIX:?}: ' + message" --force --dry-run
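Before pushing, it can be worth checking which committer identities actually appear on the branch. The sketch below shows one way to do that on a throwaway repository. The directory, the two identities, and the variable names are invented for the demo, and git log's %ce placeholder prints the committer email.

```shell
set -eu
# Throwaway sandbox; names and addresses below are hypothetical.
WORK="$(mktemp -d)"
git init -q "$WORK/demo"
cd "$WORK/demo"

MY_EMAIL="your@email.example"
export GIT_AUTHOR_NAME="Upstream Dev" GIT_AUTHOR_EMAIL="upstream@email.example"

# One commit that was committed by someone else...
echo a > a.txt
git add a.txt
GIT_COMMITTER_NAME="Upstream Dev" GIT_COMMITTER_EMAIL="upstream@email.example" \
  git commit -q -m "imported commit"

# ...and one committed by you (the author stays untouched).
echo b >> a.txt
GIT_COMMITTER_NAME="You" GIT_COMMITTER_EMAIL="$MY_EMAIL" \
  git commit -q -am "local commit"

# List every committer email on the current branch that is not yours.
FOREIGN_COMMITTERS="$(git log --format='%ce' | grep -v -x -F "$MY_EMAIL" | sort -u)"
echo "committers other than ${MY_EMAIL}: ${FOREIGN_COMMITTERS:-none}"
```

On the real repository the same check would be run against the rewritten branch, for example with git log --format='%ce' "${SUBTREE_PREFIX:?}-master" | sort -u, and should list only your own address once the commit callback has been applied.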
Keep in mind that, unlike with git subtree or a submodule, you won't be able to separately maintain the standalone and altered copies of the project, since they will no longer share any history. If this is a third-party library you're trying to keep a vendored, up-to-date copy of in your tree, you will find that merging the upstream changes isn't really possible.