In Git, files don't have history.
Instead, commits have history—or more correctly, the commits are the history. And, commits have files. But files don't have history. Hence, if you want history, what you want are commits, and those will bring files along for the ride.
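To see this in practice, here is a minimal sketch in a throwaway repository (the file names, identity, and commit messages are made up for illustration). What looks like the "history of a file" is just the subset of commits whose snapshot changes that path:

```shell
#!/bin/sh
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

echo v1 > keep.txt
git add keep.txt
git commit -q -m "add keep.txt"

echo x > other.txt
git add other.txt
git commit -q -m "add other.txt"

echo v2 > keep.txt
git commit -q -am "update keep.txt"

# The history is the commits; a path argument merely selects among them.
total=$(git rev-list --count HEAD)
touching=$(git rev-list --count HEAD -- keep.txt)
echo "commits in history: $total"
echo "commits that change keep.txt: $touching"
```

Here three commits exist, and two of them are selected when the path is given.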
The `filter-branch` command will let you copy some or all commits within a repository. It starts by listing all the commits to copy. Then, for each such commit, it:
- extracts the commit;
- applies your filters; and
- makes a new commit from the filtered result.
Last, once it has filtered all your commits, it makes the filtered branch names (and optionally annotated tags as well) point, not to the old commits (which are still in your repository), but to the new commits instead.
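The copy-then-repoint behavior can be observed with a small experiment. This is only a sketch in a scratch repository (names and the trivial message filter are chosen for illustration): the branch ends up at a new commit, while the old one stays reachable through the backup ref that filter-branch writes under `refs/original/`.

```shell
#!/bin/sh
# Scratch repository; names are hypothetical.
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the warning pause in newer Git versions
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

echo one > a.txt
git add a.txt
git commit -q -m "add a.txt"
old=$(git rev-parse HEAD)

# A trivial filter: copy each commit with one extra line appended to its message.
git filter-branch -f --msg-filter 'cat; echo "(filtered)"' -- --all >/dev/null

new=$(git rev-parse HEAD)
saved=$(git for-each-ref --format='%(objectname)' refs/original/)
echo "old commit:            $old"
echo "new commit:            $new"
echo "kept in refs/original: $saved"
```

The old and new hashes differ, and the hash saved under `refs/original/` is the old one.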
If you filter all commits on all branches (and all tag names with `--tag-name-filter cat`), then throw away the original branches, the result is a new repository with whatever changes your filter(s) made. Typically one might use `git filter-branch` to remove all copies of some sensitive file (e.g., with passwords). That is, the filter says "if file dontcommit.txt exists, remove it".
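As a sketch of that sensitive-file case, the following uses `--index-filter` with `git rm --cached --ignore-unmatch`, the form shown in the git filter-branch documentation, rather than a tree filter. The scratch repository and its contents are invented for the demonstration.

```shell
#!/bin/sh
# Scratch repository; file contents and identity are hypothetical.
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the warning pause in newer Git versions
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

echo secret > dontcommit.txt
echo hello > README
git add dontcommit.txt README
git commit -q -m "first"
echo more >> README
git commit -q -am "second"

# "If file dontcommit.txt exists, remove it", applied to every commit on every ref.
git filter-branch -f \
    --index-filter 'git rm -q --cached --ignore-unmatch dontcommit.txt' \
    --tag-name-filter cat \
    -- --all >/dev/null

# Count rewritten commits on the current branch that still touch the file.
remaining=$(git rev-list --count HEAD -- dontcommit.txt)
commits=$(git rev-list --count HEAD)
echo "commits still touching dontcommit.txt: $remaining"
echo "commits on the branch: $commits"
```

The count is taken from `HEAD` rather than `--all` on purpose: the pre-filter commits are still reachable through `refs/original/` until those refs are deleted.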
If you write a filter that says "remove everything except file keep.txt", and use that to filter all commits in all branches and all tag names, the copied commits will have only the one file. Filter-branch can be told to discard "empty" commits, i.e., those that make no change after the filter has been applied, so by adding `--prune-empty` you can toss out all the commits that affect only other files.
Note that filter-branch is very slow (extracting and re-making all the commits takes a significant amount of work in a big repository), and it should be done on a new clone of the repository (so that you cannot damage the original). Assuming you have a Unix-like system with a good `find` command, this filter-branch should therefore do the trick:
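    git filter-branch \
        --tree-filter \
        'find . ! -type d ! -path ./path/to/keep.txt -print0 | xargs -0 rm -f' \
        --prune-empty \
        --tag-name-filter cat \
        -- --all

Replace `path/to/keep.txt` with the one file you intend to keep, of course, but leave the leading `./` in place: `find` prints every path relative to its starting point `.`, and `-path` matches against that whole string. (Note that this relies on the fact that Git does not save empty directories: the `! -type d` test skips the directories themselves, because `rm -f` on a directory reports an error and would make the tree filter fail, but all their contents are removed, except of course the one "keep" path that we made sure not to print.)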
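Before running a tree filter over a whole repository, it can help to try the deletion pipeline in a scratch directory. The sketch below needs no Git at all. The directory layout is invented, and the pipeline shown is one that deletes every non-directory except a single path:

```shell
#!/bin/sh
# Scratch directory standing in for the work-tree of one commit (layout is made up).
demo=$(mktemp -d)
cd "$demo"
mkdir -p path/to other/dir
echo keep > path/to/keep.txt
echo drop > path/to/drop.txt
echo drop > other/dir/file.txt
echo drop > .gitignore

# Delete every non-directory except the one path to keep.
# The pattern starts with ./ because find prints paths relative to "." that way.
find . ! -type d ! -path ./path/to/keep.txt -print0 | xargs -0 rm -f

# List the files that survived; leftover empty directories do not matter to Git.
survivors=$(find . ! -type d | sort)
echo "$survivors"
```

The only surviving file is `./path/to/keep.txt`.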
(Another option is to copy it out of the way, then remove everything, then move it back into place:

    cp path/to/keep.txt /tmp/keep.txt &&
    rm -rf .??* * &&
    mv /tmp/keep.txt .

which lets you change the location of the file within the temporary directory that `git filter-branch` uses to make each new commit. The `.??*` here is intended to remove files like `.gitignore` and `.gitattributes`: any dot-files at the top of the work-tree. The pattern deliberately cannot match `.` or `..`, though it also misses two-character names such as `.a`. Note that the filtering is done in a temporary directory that does not contain the `.git` repository itself.)
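The copy-out variant can also be tried outside Git first. This sketch uses two scratch directories (both hypothetical): one plays the temporary work-tree, the other is a holding area used instead of a fixed name under `/tmp`:

```shell
#!/bin/sh
work=$(mktemp -d)    # stands in for filter-branch's temporary work-tree
stash=$(mktemp -d)   # holding area outside the work-tree
cd "$work"
mkdir -p path/to src
echo keep > path/to/keep.txt
echo drop > src/main.c
echo drop > .gitignore

# Copy the file out, wipe the tree (including top-level dot-files), move it back.
cp path/to/keep.txt "$stash/keep.txt" &&
rm -rf .??* * &&
mv "$stash/keep.txt" .

# Show everything left at the top level, dot-files included.
remaining=$(ls -A)
echo "$remaining"
```

Afterwards the only entry is `keep.txt`, now at the top of the tree instead of under `path/to/`.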