The BFG Repo Cleaner site gives an example of using the tool as follows to clean up a repository:

  1. Clone a fresh copy of your repo.

    $ git clone --mirror git://example.com/some-big-repo.git
    
  2. Run BFG to clean up your repo.

    $ java -jar bfg.jar --strip-blobs-bigger-than 100M some-big-repo.git
    
  3. Use git gc to strip out the unwanted dirty data

    $ cd some-big-repo.git
    $ git reflog expire --expire=now --all && git gc --prune=now --aggressive
    
  4. Push changes back up to the remote

    $ git push
    

I understand that HEAD is protected, so any file in the HEAD commit that is larger than 100M will still be there. If I run this tool as described, I will lose all history of that 100M file, correct? So if an old version of the file exists in an old commit, it will be gone and I will not be able to use it in its previous state, correct?

Also, I have a coworker that stated the following and I am wondering if it is true:

If you push back to the repository that was mirrored in TFS, the changes to your pack file won't be reflected on the remote or in future clones.

You have to create a new repository in TFS and push the mirror there for the remote to pick up the pack file changes.

Bill Greer

2 Answers

Any file still present at the HEAD of the repo will be preserved, including the history. It's to protect you from making mistakes. The idea is that you should explicitly delete the file, commit the deletion, then clean up the history to remove it.
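The "delete, commit, then clean" order can be sketched in a throwaway repository. This is only an illustration using plain git: `big.bin` is a hypothetical stand-in for the oversized file, and the BFG run itself is left out. It shows that after the deletion is committed the file is no longer at HEAD (so BFG's protection no longer applies to it), while the earlier commit still references it until the history is rewritten.

```shell
#!/bin/sh
set -eu

# Throwaway demo repository; big.bin is a hypothetical stand-in
# for a file that would exceed the 100M threshold.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

printf 'pretend this is larger than 100M\n' > big.bin
git add big.bin
git commit -q -m "Add big.bin"

# Explicitly delete the file and commit the deletion.
git rm -q big.bin
git commit -q -m "Remove big.bin"

# Files present at HEAD (the set that BFG protects by default).
head_files=$(git ls-tree -r --name-only HEAD)

# Commits that still touch big.bin somewhere in history.
history_hits=$(git log --all --oneline -- big.bin | wc -l | tr -d ' ')

echo "files at HEAD: '${head_files}'"
echo "commits touching big.bin: ${history_hits}"
```

Only after this point would the BFG command from the question remove the old blob from the earlier commits.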

TFS does not gc its repos; your colleague is correct. See "Team Foundation Server 2015 (tfs2015) run git gc --prune=now on orgin/remote" for confirmation.

Daniel Mann

I recently used the BFG Repo Cleaner as well, to delete some folders from a Git repo in TFS.

If you also want to modify the HEAD commit, use the --no-blob-protection parameter.

In the cleaned (old) commits, the files you removed are gone. The commits themselves are still there, but the file is missing from each of them, so you will not be able to see the file's history.
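Before cleaning, it can help to see which blobs anywhere in history would be affected by a size threshold. The following is a rough sketch using only standard git plumbing, not part of BFG; the function name `list_large_blobs` is made up for this example, and the byte value in the usage note assumes that 100M means 100 × 1024 × 1024 bytes.

```shell
# Print "<size-in-bytes> <path>" for every blob reachable from any ref
# whose size is at least the given threshold (in bytes).
# list_large_blobs is a hypothetical helper name, not a git or BFG command.
list_large_blobs() {
  threshold_bytes=$1
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    awk -v min="$threshold_bytes" '
      $1 == "blob" && $2 + 0 >= min + 0 {
        size = $2
        sub(/^blob [0-9]+ /, "")   # keep only the path, which may contain spaces
        print size, $0
      }'
}
```

Run inside the repository, for example as `list_large_blobs 104857600`. Any path it prints that is not also present at HEAD is a candidate for being stripped from the old commits.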

For safety reasons, I would always rename the old repo and create a new one, probably even under a different repo name, so that my co-workers can't get the wrong repo merged into their working copies.

If you really want to, it is possible to run git push --all --force and rewrite the complete history on the TFS repo. But then the old history is gone.
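As a small illustration of what a forced push does and does not update, here is a sketch against a local bare repository standing in for the TFS remote (the paths and the `stale` branch name are made up for the demo). It rewrites a commit, force-pushes with `--all`, and then pushes with `--mirror`, counting the remote branches after each step. It says nothing about whether the server garbage-collects the old objects.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
remote="$work/remote.git"   # hypothetical stand-in for the TFS-hosted repo
git init -q --bare "$remote"

git init -q "$work/local"
cd "$work/local"
git config user.email "demo@example.com"
git config user.name "Demo"
echo one > file.txt
git add file.txt
git commit -q -m "First commit"
git branch stale
git push -q --all "$remote"

# Simulate a history rewrite: replace the commit and drop the extra branch.
git commit -q --amend -m "Rewritten first commit"
git branch -q -D stale

# --all --force overwrites the branches that exist locally,
# but leaves the remote's 'stale' branch in place.
git push -q --all --force "$remote"
refs_after_all=$(git ls-remote --heads "$remote" | wc -l | tr -d ' ')

# --mirror also deletes remote refs that no longer exist locally.
git push -q --mirror "$remote"
refs_after_mirror=$(git ls-remote --heads "$remote" | wc -l | tr -d ' ')

echo "remote branches after --all --force: $refs_after_all"
echo "remote branches after --mirror: $refs_after_mirror"
```

In this sketch the old branch survives the `--all --force` push and disappears only after the `--mirror` push, which is the reason to prefer pushing from a `git clone --mirror` copy when stale refs on the remote matter.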

milbrandt
  • I could be wrong, but I don't think a force push would handle the case of "dead" references hanging around in the remote repo, but I **think** specifying `--mirror` would – Daniel Mann Mar 30 '18 at 22:35
  • According to https://git-scm.com/docs/git-push "Usually, the command refuses to update a remote ref that is not an ancestor of the local ref used to overwrite it [..] This flag disables these checks, and can cause the remote repository to lose commits". I'm not sure if it will work if all commits (including the very first commit) have been rewritten, but otherwise it should do. It's like rebasing a very old branch with a lot of commits. – milbrandt Mar 31 '18 at 07:05