
I have a long-standing Git repo that ended up with a whole load of irrelevant files in it from another developer, which were taking up a huge amount of storage. It was using something silly like 5 GB because he had included resource files; there were 5000 PSD files in the repo.

I have removed all those files from the repo and added the folder to .gitignore. I also went through and removed a bunch of plugins (it's a WordPress site) and added them using WPackagist instead, so they aren't committed to the repo (only the composer.json is).

So after all the cleanup (removing cached files from the repo, adding everything to .gitignore, and then committing everything as a "cleanup" commit), when I come to push to GitLab it still adds up to around 5 GB, and I have no idea why, since I've removed all the large files and folders.

Just wondering what I'm missing? It won't even push to the new repo on GitLab, as it's far too big and the connection ends up being cut off.

MMMWeirdo
  • Have you confirmed that all of the unwanted files are removed from the history? Even if they are removed from the working copy, they may still be included in a prior commit that hasn't been pushed to the server. If this is the case, you will have to rebase/squash. – Jonathon S. Jan 17 '22 at 13:38
  • Did you also remove the files from the history, and run a garbage collection? After all of that you have to force-push, because you are changing the past. Everyone using this repo then has to rebase their changes. None of this is a typical workflow. – Klaus Jan 17 '22 at 13:38

2 Answers


Because you still have your Git history, the files are still technically there, even though they are no longer in your latest commit. A push sends that history too, not just the current state of the files.
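One way to confirm this is to list the largest blobs reachable from any ref, which includes files that were deleted in later commits. This is only a minimal sketch, assuming a POSIX shell; the function name `largest_blobs` is made up for illustration and is not a Git command.

```shell
# Hypothetical helper: print "<size-in-bytes> <path>" for the N largest
# blobs reachable from any ref (default N = 20). Run it inside the repo.
largest_blobs() {
  n=${1:-20}
  # rev-list --objects prints "<sha> <path>" for every reachable object;
  # cat-file --batch-check adds the type and size, %(rest) keeps the path.
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    awk '$1 == "blob"' |
    sort -k2,2nr |
    head -n "$n" |
    cut -d' ' -f2-
}

# Usage, from inside the repository:
#   largest_blobs 10
```

If the PSD files show up in that list even though they are gone from the working tree, they are still part of the history that gets pushed.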

You can remove all Git history for the repo and have the current state become the initial state:

As seen from: https://stackoverflow.com/a/26000395

  1. Check out a new orphan branch (a branch with no history)
  • git checkout --orphan latest_branch
  2. Add all the files
  • git add -A
  3. Commit the changes
  • git commit -am "commit message"
  4. Delete the old main branch
  • git branch -D main
  5. Rename the current branch to main
  • git branch -m main
  6. Finally, force update your repository
  • git push -f origin main

Also see: Make the current commit the only (initial) commit in a Git repository?
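The local steps above (everything except the final push) can be wrapped up and checked before force-pushing. The following is a rough sketch, not part of the linked answer: the function name `squash_history` is hypothetical, and the reflog/gc step at the end is an extra that only reclaims space in the local clone.

```shell
# Hypothetical helper: replace all history on branch $1 with a single commit.
# WARNING: this discards the whole project history of that branch.
squash_history() {
  branch=$1
  message=${2:-"Initial commit"}
  git checkout -q --orphan latest_branch &&
    git add -A &&
    git commit -q -m "$message" &&
    git branch -D "$branch" >/dev/null &&
    git branch -m "$branch" &&
    # Optional: drop reflog entries that still point at the old commits,
    # then prune the now-unreachable objects from the local clone.
    git reflog expire --expire=now --all &&
    git gc -q --prune=now
}

# Usage, from inside the repository:
#   squash_history main "cleanup"
#   git rev-list --count main    # number of commits left on main
#   git push -f origin main
```

Checking the commit count before the force push is a cheap way to make sure the rewrite did what was intended.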

j6t
Michael Mintz
  • I edited the answer to emphasize that this solution removes all project history. Hope you agree. – j6t Jan 17 '22 at 15:29
  • It's also possible to remove large files from the history *and* keep the history (a rewritten version of it). – mkrieger1 Jan 17 '22 at 15:32

If you want to preserve all of your history, but just get rid of certain files, you should use something like git filter-repo (https://github.com/newren/git-filter-repo).

git filter-repo --invert-paths --path the/big/file/to/forget

The usual caveat applies: have good backups before rewriting history, to prevent unexpected loss of data.
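For a case like the question's, where thousands of files share an extension, a glob may be more practical than listing paths one by one. This is only a sketch, assuming git-filter-repo is installed and is run in a fresh clone (it refuses to run elsewhere unless forced); the helper name `purge_globs` is made up, and how a given glob matches nested paths is worth checking on a backup copy first.

```shell
# Hypothetical helper: remove every path matching the given globs from all
# commits, keeping the rest of the history (rewritten).
purge_globs() {
  [ "$#" -gt 0 ] || return 1
  # Turn each glob argument into a "--path-glob <glob>" pair.
  for glob in "$@"; do
    set -- "$@" --path-glob "$glob"
    shift
  done
  git filter-repo --invert-paths "$@"
}

# Usage, from inside a fresh clone:
#   purge_globs '*.psd'
# filter-repo removes the "origin" remote as a safety measure, so it has to
# be added back before force-pushing the rewritten branches.
```

After a rewrite like this, everyone else working on the repository needs to re-clone or rebase onto the new history.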

Rudedog