
We work with VS and VSTS for the most part.

Git: someone broke our repo by deleting .gitignore and committing almost 2000 files that were meant to be ignored, bloating it by 250MB. We already cleaned it up (had to do cmd git commands), but a fresh git clone of this repo still downloads the extra 2000 files, before the deltas kick in and delete those files.

Assuming that creating an entirely fresh new repo is not an option (business reasons), what would be the best way to make it so that a fresh git clone would not download the ignored 2000 files?

Some branches still have those 2000 uncleaned files, but not the master branch. We haven't deleted those branches yet for archiving purposes.

ZekiraDrake
  • What "do cmd git commands" did you apply? You can't just get files in the repo not to be cloned--if they're there, they will be cloned. The strategy you should be striving to implement is one that rids the repo of those files. Please give more detail about your branch and commands so far applied. What does "before the deltas kick in" mean??? – Jazimov Apr 27 '18 at 12:45
  • Use 'bfg repo cleaner' https://rtyley.github.io/bfg-repo-cleaner/ – Philippe Apr 27 '18 at 12:54
  • Possible duplicate of [Git cleanup/garbage collection on remote VSO git repository](https://stackoverflow.com/questions/44236321/git-cleanup-garbage-collection-on-remote-vso-git-repository) – Daniel Mann Apr 27 '18 at 17:42
  • @Jazimov I forget exactly, but it was basically a git rm that removed everything, followed by a git add with parameters that excluded everything listed in the .gitignore. When we pushed that to the remote, the 2000 files were deleted from master. – ZekiraDrake Apr 28 '18 at 12:23

1 Answer


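A fresh git clone of this repo still downloads the extra 2000 files because the commit that deleted .gitignore (and added those files) was also pushed to the remote repo. Deleting the files in a later commit does not remove them from the earlier commit, so they remain part of the history that every clone fetches. To get rid of them, you need to remove that commit from the remote repo.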

Below are some options you can follow based on your situation:

First, check the commit history of each branch and find the commit that deleted the .gitignore file:

In VS -> Team Explorer -> Branches -> right-click a branch -> View History -> find the commit that deleted .gitignore (you can view a commit's details by right-clicking the commit and selecting View Commit Details). Assume this is commit C (as in the graphs below).
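The same lookup can also be done from the command line. The following is only a sketch in a throwaway repository (the temporary directory, identity, and commit messages are made up for illustration); the relevant part is `git log --diff-filter=D -- .gitignore`, which limits the log to commits that deleted that path.

```shell
set -e
# Throwaway repository for illustration only; all names here are hypothetical.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'bin/\n' > .gitignore
git add .gitignore
git commit -q -m "A: add .gitignore"

git rm -q .gitignore
git commit -q -m "C: delete .gitignore"

# List the commits that deleted .gitignore (newest first) and keep the latest.
commit_c=$(git log --diff-filter=D --format=%H -- .gitignore | head -n 1)

# Show what that commit changed, to confirm it is the one to remove.
git show --stat --oneline "$commit_c"
```

In a real repository only the `git log` and `git show` lines are needed, run once per branch (or with `--all` to search every branch at once).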

  • If the commit that deleted .gitignore is the latest commit on the branch (as in the commit history below):

    ...---A---B---C   branchname
    

    Then you can reset the commit and force push to your remote repo:

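    git checkout branchname
    git reset --hard HEAD~    # drop commit C; the branch now points at B
    git push -f               # overwrite the remote branch with the rewritten history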
    

    The commit history on the branch will be:

    ...---A---B    branchname
    
  • If the commit that deleted .gitignore is not the latest commit on the branch (as in the commit history below):

    ...---A---B---C---D---...---E   branchname
    

    Then you need to remove commit C and rebase the commits that follow it:

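    git checkout branchname
    git checkout <commit id for C>    # detached HEAD at C
    git reset --hard HEAD~            # move HEAD back to B, the parent of C
    git rebase --onto HEAD <commit id for C> branchname    # replay D..E onto B
    git push -f                       # overwrite the remote branch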
    

    Then the commit history on the branch will be:

    ...---A---B---D'---...---E'   branchname
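To make the second scenario concrete, here is a sketch in a throwaway repository (the file names, commit messages, and `junk.txt` standing in for the unwanted files are all assumptions for illustration). It uses a one-step equivalent of the commands above: since the checkout/reset pair only serves to point HEAD at B, the parent of C can be named directly as `<C>~1`.

```shell
set -e
# Throwaway repository for illustration only; all names here are hypothetical.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
git checkout -q -b branchname

# Helper: create a file and commit it with the given message.
commit_file() {
  printf '%s\n' "$2" > "$1"
  git add "$1"
  git commit -q -m "$2"
}

commit_file a.txt "A"
commit_file b.txt "B"
commit_file junk.txt "C"        # stands in for the unwanted commit
commit_c=$(git rev-parse HEAD)
commit_file d.txt "D"
commit_file e.txt "E"
old_tip=$(git rev-parse HEAD)

# Replay everything after C (that is, D..E) onto C's parent, dropping C itself.
git rebase -q --onto "$commit_c~1" "$commit_c" branchname

# Commit subjects on the rewritten branch, newest first.
git log --format=%s branchname
```

After the rebase, the branch contains A, B, D, E and no longer has `junk.txt`. The primes in D' and E' indicate that these are new commits: they carry the same changes as D and E but have different commit IDs, because their parent changed. That is also why the push has to be forced.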
    
Marina Liu
  • My scenario is the second one. If I reset --hard on Commit C and then rebase, would the files in branch E still be safe? I'm not familiar with git notation so I'm not sure what the ' means in that case. Also, do I need to do this with all branches or can I just do this on master (the default branch)? Our main objective is to make sure that new, fresh clones of the repo won't include the extra 2000 files. – ZekiraDrake Apr 30 '18 at 03:45
  • Also, for example, i do this on say a develop_6 branch; can I just merge that to master to eliminate the commit that deleted .gitignore? Though the commit ID is the same across both branches because of the merging we did, so I'm hoping that I only need to do this once on develop_6 and then merge up to master? – ZekiraDrake Apr 30 '18 at 04:00
  • You need to execute the commands on all branches so that the 2000 files are not recorded in any commit. If you only execute the commands on develop_6 and merge it into master, the 2000 files will still be tracked in version control. – Marina Liu Apr 30 '18 at 04:41
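To see which branches still need the rewrite, `git branch --contains` lists every branch whose history includes a given commit. A small sketch in a throwaway repository (the `archive_1` branch and the empty commits are made up for illustration; `master` and `develop_6` are borrowed from the comments above):

```shell
set -e
# Throwaway repository for illustration only; all names here are hypothetical.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
git checkout -q -b master
git commit -q --allow-empty -m "A"

git checkout -q -b develop_6
git commit -q --allow-empty -m "C: unwanted commit"
commit_c=$(git rev-parse HEAD)

git checkout -q -b archive_1    # an archived branch that also carries C
git checkout -q master

# Branches whose history still includes commit C.
affected=$(git branch --format='%(refname:short)' --contains "$commit_c")
echo "$affected"
```

In a real clone, adding `-a` also lists remote-tracking branches. As long as any branch on the remote still contains commit C, a fresh clone will keep downloading its files.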