... when I run git log --oneline --branches -- <large_file_name>, there are no commits which reference the file, which may be because I rewrote the history of commits ...
That's fine (assuming that's your intent). What you need to do now is make sure no other external references reach commits that use the file(s).
Using --branches tells git log or git rev-list (see footnote 1) to look at all branch name references, i.e., everything under refs/heads/. But there may be tag name references under refs/tags/, so you should check there too. There may even be other references, so you should check all of them. The easiest way to do that is to use --all rather than --branches: that looks at all references.
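As a minimal sketch of that check, the helper below wraps the same git log invocation with --all in place of --branches. The function name commits_touching is made up for illustration; the git options are the ones discussed above.

```shell
# Hypothetical helper (the name is an assumption, not part of Git):
# list commits reachable from ANY reference that touch the given path.
# Prints one "hash subject" line per commit, or nothing if no
# reference reaches a commit that touches the file.
commits_touching() {
    git log --oneline --all -- "$1"
}

# Usage: commits_touching path/to/large_file
```

Empty output here means no branch, tag, or other reference leads to a commit that touches the file. It says nothing about reflogs.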
But this also misses the reflogs. Every reference has (at least potentially) a reflog. To walk the reflogs, use -g or --walk-reflogs. Note that you must do this as a separate run. If there's a reflog entry that references the commit, you can expire it manually; or you can use the brute-force method of just expiring all reflogs wholesale (which is a little dangerous since the reflogs are your main safety net, but you are doing all this on a copy of the original repository, right? :-) ).
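A rough sketch of both steps follows. The function names are invented for illustration. Combining -g with --all to cover the reflog of every reference is an assumption about how you want to scope the walk, so try it on your copy first.

```shell
# Hypothetical helper: walk the reflogs of all references and show
# the entries whose commits touch the given path.
# -g is placed before --all so the reflog walk is enabled first.
reflog_entries_touching() {
    git log -g --oneline --all -- "$1"
}

# Brute force: expire every reflog entry of every reference, now.
# This discards the safety net, so run it only on a copy.
expire_all_reflogs() {
    git reflog expire --expire=now --all
}

# Usage:
#   reflog_entries_touching path/to/large_file
#   expire_all_reflogs
```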
Note that when you use git filter-branch to "rewrite history", you're really copying all of history to a new history. As such, you can temporarily increase the repository size up to about double, depending on what you do in your filters. Removing old reflogs and removing the saved original references under the refs/original/ namespace, followed by garbage collection, should shrink things back to size.
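The cleanup sequence could be sketched as below. The function name shrink_after_rewrite is hypothetical, and --prune=now is an assumption that you want unreachable objects removed immediately rather than after the default grace period.

```shell
# Hypothetical helper: drop filter-branch backups, expire reflogs,
# then garbage-collect. Run this on a copy of the repository.
shrink_after_rewrite() {
    # Delete every saved reference under refs/original/.
    git for-each-ref --format='%(refname)' refs/original/ |
        while read -r ref; do
            git update-ref -d "$ref"
        done
    # Remove all reflog entries so they no longer keep old commits alive.
    git reflog expire --expire=now --all
    # Repack and prune unreachable objects right away.
    git gc --prune=now
}

# Usage: shrink_after_rewrite
```

Comparing the output of git count-objects -vH before and after gives a quick read on how much space was reclaimed.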
Note also that if a pack file has a corresponding .keep file, Git won't throw out the kept pack even after building a new pack that covers everything. Any .keep files were created manually and must be removed manually if and when that's appropriate.
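To see whether any kept packs exist, something like the following sketch works. The function name list_keep_files is made up, and it assumes a Git new enough to support git rev-parse --git-path.

```shell
# Hypothetical helper: print the path of every .keep file in the
# repository's pack directory (prints nothing if there are none).
list_keep_files() {
    find "$(git rev-parse --git-path objects/pack)" -name '*.keep'
}

# Usage: list_keep_files
```

Deleting a listed file is then an ordinary rm, once you have decided the pack no longer needs protecting.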
Footnote 1: These two commands, git log and git rev-list, are very closely related: both are front ends to the same revision-walking code in revision.c, with git log implemented in builtin/log.c and git rev-list in builtin/rev-list.c. They have different entry points that set up some different default options, and git log will start from HEAD if you don't name any other starting points, while git rev-list demands some starting points.
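A small sketch of that difference: the hypothetical function below checks that git log with no starting point visits the same commits as git rev-list given HEAD explicitly.

```shell
# Hypothetical check: succeeds when `git log` (implicit HEAD) and
# `git rev-list HEAD` (explicit starting point) list the same commits
# in the same order.
log_matches_rev_list_head() {
    [ "$(git log --format=%H)" = "$(git rev-list HEAD)" ]
}

# Usage: log_matches_rev_list_head && echo "same walk"
# Running `git rev-list` with no arguments at all fails with a usage message.
```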