Using git-filter-repo
git filter-repo is recommended by the Git project over git filter-branch.
git filter-repo --strip-blobs-bigger-than 1M
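Before rewriting anything, it can help to see which blobs would actually be affected. The following is a minimal sketch, not part of git-filter-repo itself: `list_large_blobs` is a hypothetical helper name, and the default threshold assumes that `1M` means 1 MiB (1048576 bytes).

```shell
# Hypothetical helper: list every blob in history larger than a threshold,
# biggest first. Run it inside the repository you intend to rewrite.
list_large_blobs() {
  # $1: threshold in bytes (assumed default: 1 MiB, to mirror "1M" above)
  min_bytes="${1:-1048576}"
  # rev-list prints "<sha> <path>" for every object reachable from any ref;
  # cat-file looks up type and size and passes the path through as %(rest).
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' |
    awk -v min="$min_bytes" '
      $1 == "blob" && $2 + 0 > min + 0 {
        size = $2
        sub(/^[^ ]+ [^ ]+ [^ ]+ ?/, "")   # strip type, size and sha; keep the path
        print size, $0
      }' |
    sort -rn
}
```

Calling `list_large_blobs` prints one `size path` line per oversized blob. Note that git filter-repo expects to be run in a fresh clone and refuses to proceed otherwise unless forced.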
Using BFG Repo-Cleaner
The older BFG Repo-Cleaner used to be the most popular tool for this kind of history cleanup.
To remove all files with a size > 1 MB:
$ bfg --strip-blobs-bigger-than 1M my-repo.git
By default, BFG does not touch the files in your latest commit (HEAD); it only rewrites earlier history.
Don't use git filter-branch
git filter-branch has a plethora of pitfalls that can produce non-obvious manglings of the intended history rewrite (and can leave you with little time to investigate such problems since it has such abysmal performance). These safety and performance issues cannot be backward compatibly fixed and as such, its use is not recommended.
Source
Second question: how to keep specific files from being stored in the history
You can add files to .gitignore so that they are never added in the first place. Git cannot be configured to delete them from history automatically, though, so you would need some kind of hook that automatically executes bfg or git-filter-repo.
Better to prevent the problem in the first place
Tools like bfg are meant for rare exceptions. Ideally, you should prevent large binary files from being included in the repository in the first place. There are many other ways to preserve them, for example attaching them to a GitHub release or uploading them to a package repository that fits your environment, such as npm, a Maven repository or GitHub Packages.
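One way to enforce this locally is a pre-commit hook that rejects oversized files before they ever reach the history. This is only a sketch under assumptions: the function name `check_staged_sizes` and the 1 MiB limit are illustrative, and paths containing newlines or other characters that Git quotes in its output are not handled.

```shell
#!/bin/sh
# Sketch of a .git/hooks/pre-commit check: refuse staged files above a limit.
max_bytes=1048576   # assumed limit (1 MiB); adjust to your needs

check_staged_sizes() {
  # List paths that are added, copied or modified in the index.
  git diff --cached --name-only --diff-filter=ACM |
  (
    status=0
    while IFS= read -r path; do
      # ":<path>" names the staged version of the file, not the working copy.
      size=$(git cat-file -s ":$path") || continue
      if [ "$size" -gt "$max_bytes" ]; then
        echo "error: $path is $size bytes (limit: $max_bytes)" >&2
        status=1
      fi
    done
    exit "$status"   # leaves only this subshell; becomes the function's status
  )
}
```

To use it as a hook, save the script as an executable .git/hooks/pre-commit and end it with `check_staged_sizes || exit 1`. Hooks are not transferred by cloning, so each contributor has to install it, and a server-side check is still needed if the limit must be enforced for everyone.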