
I have a largish (32k commits) git repository in which I need to rewrite history in one branch to remove a bunch of large files as described by a .gitattributes file. This branch is entirely local and has never hit a remote (in fact our remote is rejecting it because of the large files in history).

I know that the following command will go through the history of the branch and convert all .dll files to LFS pointers:

$ git lfs migrate import --include='*.dll'

but since the .gitattributes file exists and is rather extensive, is there a command that simply replays the work that would have been done to pointer-ize those files, if the .gitattributes file had existed back when the branch was created?

Adam Smith
  • The use case here is a migration of a mature code base from a CVCS to Git. We've imported all the commits in history (a process that took ~20hrs) but failed to create a `.gitattributes` before doing so. – Adam Smith Sep 30 '19 at 07:40
  • There's a repo-local `.git/info/attributes` file that always applies. I don't know whether lfs's `--fixup` honors it, but I think it should. – jthill Jun 21 '20 at 20:21

1 Answer


I would first insert the correct .gitattributes at the beginning of the branch (using git rebase, for example):

*--*--x--*--*--*--*--* <- master
       \
        *--*--*--*--*--*--a--* <- my/branch
                          ^
                          commit with updated .gitattributes

# with the commits identified as above, run this from branch my/branch:
$ git rebase -i x
   ...
   # in the editor that opens, move commit 'a' to the top of the list
   # save & close

# you should obtain:
*--*--x--*--*--*--*--* <- master
       \
        a'--*--*--*--*--*--*--* <- my/branch (rewritten)
        ^
      rewritten commit
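
Before going further, it may be worth confirming that the first commit of the rewritten branch really carries .gitattributes. The following is a minimal sketch, assuming the branch forks from master as in the diagram and has linear history; the helper names `first_branch_commit` and `branch_starts_with_attributes` are hypothetical.

```shell
# Hypothetical helper: print the oldest commit reachable from branch $2
# but not from base $1 (in the diagram above: base = master, branch = my/branch).
first_branch_commit() {
    git rev-list --reverse "$1..$2" | head -n 1
}

# Hypothetical helper: succeed if that oldest commit already contains
# a .gitattributes file at the root of its tree.
branch_starts_with_attributes() {
    first=$(first_branch_commit "$1" "$2") || return 1
    [ -n "$first" ] && git cat-file -e "$first:.gitattributes" 2>/dev/null
}
```

For example, `branch_starts_with_attributes master my/branch && echo ok` should print `ok` once commit a' sits at the start of the branch.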

After that, you can use git filter-branch --tree-filter to have git replay the commits one after another and apply the filters described in .gitattributes:

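# the argument to --tree-filter is a shell command to run on each commit
#   you have nothing to edit: just use 'true' as the command
#   the only effect you want is that git applies the filters
#   when re-staging files for each individual commit
# A_PRIME is a placeholder: replace it with the hash of the rewritten commit a'
#   (a literal a' is not a ref, and the apostrophe would open a shell quote)

git filter-branch --tree-filter true A_PRIME..my/branch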

You may want to add the --prune-empty option, or you can remove any empty commits after the rewrite, for example with git rebase -i again.
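
One way to check the result afterwards is to look at what is actually stored for each matching file. The sketch below assumes the standard Git LFS pointer format, whose first line is `version https://git-lfs.github.com/spec/v1`. The helper names `is_lfs_pointer` and `list_unconverted_dlls` are made up for illustration, and the `*.dll` pattern is simply the one from the question.

```shell
# Hypothetical helper: succeed if the data on stdin starts with the
# Git LFS pointer version line.
is_lfs_pointer() {
    head -n 1 | grep -qx 'version https://git-lfs.github.com/spec/v1'
}

# Hypothetical helper: print every *.dll path in commit $1 whose stored
# blob is NOT an LFS pointer (i.e. the full binary is still in history).
# Assumes paths without newlines or other characters that git would quote.
list_unconverted_dlls() {
    git ls-tree -r --name-only "$1" | grep '\.dll$' | while IFS= read -r path; do
        git cat-file blob "$1:$path" | is_lfs_pointer || printf '%s\n' "$path"
    done
}
```

Running `list_unconverted_dlls my/branch` should print nothing if every .dll at the tip of the branch is stored as a pointer. It inspects a single commit only, so older commits would need the same check.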

LeGEC