I've been experimenting with git subtree and have run into the following situation.
I used git subtree to add an external project to my repo. I intentionally kept all of the upstream project's history, because I want to be able to refer to it and also contribute back to the upstream project later.
As it turns out, another contributor to the upstream project accidentally pushed a large file to the master branch. To fix this, the upstream project rewrote history and force-pushed to master. When I created my "monorepo", I pulled in the commit containing the large file, and I would like to remove it from my repository as well.
How can I update my repository to reflect the new history of the subtree?
My first attempt was to use filter-branch to completely remove the subtree and all of its history.
git filter-branch --index-filter 'git rm -rf --cached --ignore-unmatch upstream-project-dir' --prune-empty HEAD
The plan was that, once the old version of the subtree was removed, I could re-add the subtree using the new upstream master. However, this didn't work: the upstream commit history still shows up in the git log output.
Update
I've written up the steps for a minimal reproducible example.
First create an empty git repo.
git init test-monorepo
cd ./test-monorepo
Create an initial commit.
echo hello world > README
git add README
git commit -m 'initial commit'
Now add a subtree for an external project.
git remote add thirdparty git@github.com:teivah/algodeck.git
git fetch thirdparty
git subtree add --prefix algodeck thirdparty master
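If SSH access to GitHub isn't available, the remote in the step above could probably be swapped for a local stand-in. This is only a sketch: the function name make_fake_upstream, the directory path, and the file contents are all made up for illustration and have nothing to do with the real algodeck repository.

```shell
# Hypothetical helper: build a tiny local repo to act as the "upstream" project.
make_fake_upstream() {
    upstream_dir="$1"
    git init -q "$upstream_dir"
    # Force the branch name to master regardless of init.defaultBranch.
    git -C "$upstream_dir" symbolic-ref HEAD refs/heads/master
    echo "fake upstream" > "$upstream_dir/README.md"
    git -C "$upstream_dir" add README.md
    # Identity is passed inline so the sketch does not depend on global config.
    git -C "$upstream_dir" -c user.name=demo -c user.email=demo@example.com \
        commit -q -m 'upstream: initial commit'
}
```

After calling make_fake_upstream ../fake-upstream, the path ../fake-upstream would take the place of the GitHub URL in the git remote add command; the rest of the steps should stay the same.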
Make some commits on the monorepo.
echo dont panic >> algodeck/README.md
git commit -a -m 'test commit'
Now attempt to use git filter-branch to remove the subtree.
git filter-branch --index-filter 'git rm -rf --cached --ignore-unmatch algodeck' --prune-empty HEAD
Examine the git log output. I am expecting to see only my initial commit.
git log
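To make that expectation easier to check than by reading the full log, something like the following could be used. It is a rough sketch: history_summary is a hypothetical helper name, and it only counts what is reachable from HEAD using standard git rev-list options.

```shell
# Hypothetical helper: summarize what is still reachable from HEAD.
history_summary() {
    repo_dir="${1:-.}"
    # Total number of commits reachable from HEAD.
    echo "commits: $(git -C "$repo_dir" rev-list --count HEAD)"
    # Merge commits; git subtree add records one that joins the two histories.
    echo "merges: $(git -C "$repo_dir" rev-list --count --merges HEAD)"
    # Root commits; more than one means an unrelated history is still attached.
    echo "roots: $(git -C "$repo_dir" rev-list --max-parents=0 HEAD | wc -l | tr -d ' ')"
}
```

If the rewrite had left only my initial commit, running history_summary in the monorepo should report one commit, zero merges, and one root. Anything else means part of the upstream history is still reachable from HEAD.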