479

Some time ago I added information (files) that must be kept private. Removing it from the project is not a problem, but I also need to remove it from the Git history.

I use Git and GitHub (private account).

Note: This thread shows something similar, but my case involves an old file: it was added to a feature branch, that branch was merged into a development branch and finally into master, and a lot of changes have been made since then. So it's not the same situation; what I need is to rewrite the history and remove those files for privacy.

Alexis Wilke
Marcos R. Guevara

9 Answers

476

I have found this answer and it helped:

git filter-branch --index-filter 'git rm -rf --cached --ignore-unmatch path_to_file' HEAD

Found it here https://myopswork.com/how-remove-files-completely-from-git-repository-history-47ed3e0c4c35
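If you want to see what the command does before touching a real project, here is a minimal sandbox sketch. It builds a throwaway repository in a temporary directory; the file name `secret.env` and the commit messages are made up for illustration, and `FILTER_BRANCH_SQUELCH_WARNING=1` only silences the deprecation notice that newer Git versions print.

```shell
#!/bin/sh
# Sandbox demo: everything happens in a temporary directory.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"

# Commit a "secret" file (hypothetical name), then an unrelated change.
echo "password=hunter2" > secret.env
echo "hello" > README.md
git add secret.env README.md
git commit -q -m "add files, including a secret by mistake"
echo "more" >> README.md
git commit -q -am "unrelated later change"

# Rewrite every commit reachable from HEAD, dropping secret.env from each one.
FILTER_BRANCH_SQUELCH_WARNING=1 \
  git filter-branch --force --index-filter \
  'git rm -rf --cached --ignore-unmatch secret.env' HEAD

# Count commits on the current branch that still touch the file.
remaining=$(git log --oneline -- secret.env | wc -l | tr -d ' ')
echo "commits on this branch still touching secret.env: $remaining"
```

Note that the check looks only at the current branch. filter-branch keeps a backup of the old branch tip under refs/original/, so the old commits still exist locally until that backup is removed.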

João Pimentel Ferreira
Petro Franko
  • 56
    Warning: This creates a ton of commits and causes divergence. You probably have to force push after, but I was too scared. – sudo Aug 27 '19 at 05:54
  • 5
    Seconding what @sudo said but this did work for my fresh branch that I accidentally committed `.env` to. Quick and to the point solution. – Joe Scotto Apr 10 '20 at 20:16
  • 2
    Indeed, a simple force push works! I was also scared but backed everything up. – wutBruh Jul 04 '20 at 17:09
  • 12
    You can also specify a range of commits as the last argument. If the commit in question was recent, do `..HEAD` and save some time. – Victor Sergienko Nov 24 '20 at 00:53
  • 7
    after this it works only for me `git push --force` – Sebastian Jan 04 '21 at 16:43
  • You could see this command [in Git's manpage](https://git-scm.com/docs/git-filter-branch#_examples) also, which provides other examples, and some important notes as well. – MAChitgarha Feb 15 '21 at 17:24
  • 2
    Didn't work, and the commits i made are now on the revision history... tried many approaches... – marcolopes May 11 '21 at 22:12
  • @marcolopes I guess you did something wrong, you also have to consider doing `git push --force` – Petro Franko May 12 '21 at 09:54
  • 1
    I did everything just like the examples here and in other sources... and i did the git push --force. Tried many times. No success! File is still there and still in the revision history... :\ Now i have a huge history for that file. – marcolopes May 13 '21 at 20:40
  • 1
    One of those commands you don't understand but it works wonders ! – renatodamas Aug 31 '21 at 17:28
  • 2
    This did not remove the file from my repo, it remains as it is – alper Sep 23 '21 at 13:32
  • 26
    Current versions of Git say this about `filter-branch`: "WARNING: git-filter-branch has a glut of gotchas generating mangled history rewrites. Hit Ctrl-C before proceeding to abort, then use an alternative filtering tool such as 'git filter-repo' (https://github.com/newren/git-filter-repo/) instead. See the filter-branch manual page for more details; to squelch this warning, set FILTER_BRANCH_SQUELCH_WARNING=1." – Ryan Lundy Nov 28 '21 at 06:55
  • 1
    @sudo It did not add even a single commit for me and worked perfectly fine. What are you talking about? A ton of commits?! What am I missing? – aderchox Dec 14 '21 at 20:48
  • 2
    @aderchox it doesn't exactly "add" commits, it rewrites existing ones. Those commits get replaced by new ones, with a different hash number – ChoKaPeek Jan 10 '22 at 16:27
  • Even @sudo was scared, and there's a good reason for it. Had to merge a branch into `main`/`master` by following https://stackoverflow.com/a/4624383/929999 after this. I guess it does the job but as the warning says, do it at your own risk :) – Torxed May 09 '22 at 18:09
  • If anyone is doing this: it is a major operation, but it does seem to work. Just know that after the operation you have to run `git push origin --force --all`, and before that you probably have to allow force pushes in GitHub on main, which would normally be protected. Then have all engineers rebase onto main. It also seems to close all outstanding PRs. – bjm88 Jun 22 '22 at 03:43
  • 1
    This doesn't really remove the file from the remote, right? Best-case it will get garbage-collected at some point... Is there a more thorough method if you have access to the remote itself? – PieterNuyts Sep 23 '22 at 07:20
  • this worked for my single master branch. – StealthTrails Nov 29 '22 at 08:23
  • 1
    I had to add `--force` after `filter-branch` before anything worked. Also added `HEAD~2..HEAD` so it only did the last 2 commits where the change was made and not every commit. – MomasVII Dec 15 '22 at 23:43
  • 1
    @sudo you can create a bogus branch to test it. It's definitely not scary. – maxadamo Feb 12 '23 at 15:41
202

git-filter-repo

Git itself recommends the third-party add-on git-filter-repo (the recommendation is printed when you run git filter-branch). There is a long list of reasons why git-filter-repo is better than the alternatives; in my experience it is very simple and very fast.

This command removes the file from all commits in all branches:

git filter-repo --invert-paths --path <path to the file or directory>

Multiple paths can be specified by using multiple --path parameters. You can find detailed documentation here: https://www.mankier.com/1/git-filter-repo
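A rough way to try this out and check the result is sketched below. It is a sandbox only: the repository, the file `private.key` and the branch `feature` are hypothetical, and the script skips the rewrite if git-filter-repo is not installed. `--force` is used here only because a freshly initialised sandbox does not look like a fresh clone to git-filter-repo; on a real project, working in a fresh clone is the safer route.

```shell
#!/bin/sh
# Sandbox demo in a temporary directory; file and branch names are hypothetical.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"

echo "secret" > private.key
echo "readme" > README.md
git add .
git commit -q -m "add project files"
git checkout -q -b feature
echo "change" >> README.md
git commit -q -am "work on feature"

# Commits on any branch that touch the file, before the rewrite.
before=$(git log --all --oneline -- private.key | wc -l | tr -d ' ')

if git filter-repo -h >/dev/null 2>&1; then
  # --force is needed only because this sandbox is not a fresh clone.
  git filter-repo --force --invert-paths --path private.key
  after=$(git log --all --oneline -- private.key | wc -l | tr -d ' ')
else
  after="skipped"   # git-filter-repo is not installed on this machine
fi
echo "before: $before, after: $after"
```

The same `git log --all --oneline -- <path>` check works on a real repository after the rewrite: if it prints nothing, no commit on any local ref still touches that path.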

mikemaccana
Tibor Takács
  • 7
    i get error: git: 'filter-repo' is not a git command. See 'git --help'. – cikatomo Mar 19 '21 at 18:51
  • 19
    @cikatomo It's a third-party tool, you have to install it https://github.com/newren/git-filter-repo/blob/main/INSTALL.md – Vladimir Jovanović Mar 22 '21 at 12:10
  • This was the answer that helped me with the simple case of removing a couple of specific files from a repo. – jpw Apr 24 '22 at 09:07
  • 4
    This worked but it removed `.git` so I wonder why not just remove .`git` manually and re-init? – chovy May 02 '22 at 11:22
  • to install on arch use `yay -S git-filter-repo` – chovy May 02 '22 at 11:23
  • 4
    This should be set as the new best answer as it is more up-to-date. – GuyStalks May 20 '22 at 13:44
  • note: paths are relative to the repo root – milahu Jun 29 '22 at 19:02
  • 6
    @cikatomo Another way to install `pip install git-filter-repo`. – Kaushal Modi Aug 29 '22 at 17:18
  • @KaushalModi if I installed the plugin with `pip`, what then next? – Timo Sep 08 '22 at 18:51
  • @Timo The pip install would have installed the `git-filter-repo` executable. Just ensure that it's in the PATH. Then `git filter-repo ..` will work. – Kaushal Modi Sep 08 '22 at 22:18
  • how to also update remote repo after this? – João Pimentel Ferreira Sep 13 '22 at 16:03
  • @João Pimentel Ferreira You must force push to the remote. Be careful with force push; only do it if you know that this is what you want :) – Tibor Takács Sep 14 '22 at 20:28
  • if you are having trouble installing git-filter-repo check this out: https://stackoverflow.com/a/69356543/713847 – SZT Dec 26 '22 at 08:47
  • this re-wrote the entire history. When I pushed my branch to origin, it's completely diverged from the rest of the git history. It would work if I pushed all branches to origin, but I don't want to do that. – MuhsinFatih May 31 '23 at 21:25
  • @MuhsinFatih Of course, this is how git works: when some files are removed from the commit, this is an entirely new commit. There is no other way to remove files from the git repo than rewriting the history. (This is not specific for this solution.) – Tibor Takács Jun 05 '23 at 11:08
  • @TiborTakács I meant that I imagined it will only diverge from the first occurrence of the file, but the entire history of all branches was re-written. I think that this may not be practical if the changes are already pushed to remote. I ended up cherry picking the commits where I pushed the offending file – MuhsinFatih Jun 06 '23 at 18:09
  • 1
    @chovy to keep all of your commit history – Rokit Jul 21 '23 at 00:59
  • 1
    And as @chovy points out, this seems to break non-bare clones (you get a warning if its not bare and you `--force` it). You need to add the remote again afterwards to push your branch. In my case anyway. – oarfish Jul 25 '23 at 14:27
160

If you committed that file recently, or if that file has changed in only one or two commits, then I'd suggest you use rebase and cherry-pick to remove that particular commit.

Otherwise, you'd have to rewrite the entire history.

git filter-branch --tree-filter 'rm -f <path_to_file>' HEAD

Once you are satisfied with the changes and have verified that everything looks fine, you need to update all remote branches:

git push origin --force --all

Note: This is a complex operation, and you must be aware of what you are doing. First try it on a demo repository to see how it works. You also need to let other developers know about it, so that they don't make any changes in the meantime.
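One possible way to rehearse on a demo repository is to run the rewrite in a scratch clone, so the original is never touched. The sketch below assumes a stand-in project created on the spot; the directory names `project` and `rehearsal` and the file `private.txt` are hypothetical.

```shell
#!/bin/sh
# Rehearse the rewrite on a scratch clone; the original repository stays as it was.
set -e
work=$(mktemp -d)

# Stand-in for the real project (hypothetical content).
git init -q "$work/project"
cd "$work/project"
echo "token=abc123" > private.txt
echo "code" > app.txt
git add .
git -c user.name=Demo -c user.email=demo@example.com commit -q -m "initial"

# Clone it and run the rewrite in the clone only.
git clone -q "$work/project" "$work/rehearsal"
cd "$work/rehearsal"
FILTER_BRANCH_SQUELCH_WARNING=1 \
  git filter-branch --force --tree-filter 'rm -f private.txt' HEAD

# Compare: commits touching the file in the clone's branch vs. the original.
rehearsal_hits=$(git log --oneline -- private.txt | wc -l | tr -d ' ')
original_hits=$(git -C "$work/project" log --oneline -- private.txt | wc -l | tr -d ' ')
echo "rehearsal: $rehearsal_hits, original: $original_hits"
```

Only when the rehearsal looks right would you repeat the rewrite for real and force-push.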

einpoklum
hspandher
  • 1
    after rewrite the entire history, for keep the changes to repository (github) what must be done? – Marcos R. Guevara May 03 '17 at 14:30
  • thank you, i will wait for do it, and try it with a demo repository, i will update with all was done here. – Marcos R. Guevara May 03 '17 at 14:49
  • By mistake, I forgot to add `--all`. Now it says everything up-to-date whenever I rerun push with both the arguments. And the file is not removed from other branches. What should I do now? – MrObjectOriented Jun 29 '19 at 11:14
  • 3
    Why does your suggestion use `--tree-filter` rather than `--index-filter` like in @PetroFranko's answer? – einpoklum Jun 24 '20 at 13:10
  • 3
    holy crap, it worked! I mean it was really really simple. I've done it the hard way before, but this was much easier. Tip: the path needs to be relative. – Antebios May 08 '21 at 03:07
  • Didn't work :\ File is still on local repo and after "git push" still on the git remote repository, and the revisions are all there! :\ – marcolopes May 11 '21 at 22:48
  • @einpoklum basically, `tree-filter` rebuilds everything to (and then from) a new (temporary) directory. `index-filter` does this differently (in memory I believe) and is considerably faster. See more here: https://stackoverflow.com/questions/36255221/what-is-the-difference-between-tree-filter-and-index-filter-in-the-git – timhc22 Feb 24 '22 at 02:53
  • It does not seems to be working for me. Doing the same command exactly mentioned here but still I can see the files in remote repo. Anything I am missing here? – Gaurav Parek Mar 28 '22 at 13:15
  • i still see it in history – chovy May 02 '22 at 11:19
  • The `git filter-branch` way would require action by all users of the repository, wouldn't it? – matanster Jun 16 '22 at 14:13
  • If ```Cannot create a new backup. A previous backup already exists in refs/original/ Force overwriting the backup with -f``` you can remove backup with ```rm -rf .git/refs/original/refs/heads/``` also for removing multiple files at one express files full path as rm arguments. For example: ```git filter-branch --tree-filter 'rm -f .gitignore out.csv' HEAD``` – EsmaeelE Jul 12 '22 at 06:52
  • `WARNING: git-filter-branch has a glut of gotchas generating mangled history rewrites. Hit Ctrl-C before proceeding to abort, then use an alternative filtering tool such as 'git filter-repo' (https://github.com/newren/git-filter-repo/) instead. See the filter-branch manual page for more details; to squelch this warning, set FILTER_BRANCH_SQUELCH_WARNING=1.` – Jake Oct 14 '22 at 02:44
62

Remove the file and rewrite the history starting from the commit that added it (every rewritten commit gets a new hash).

There are two ways:

  1. Using git-filter-branch:

git filter-branch --force --index-filter 'git rm --cached --ignore-unmatch <path to the file or directory>' --prune-empty --tag-name-filter cat -- --all

  2. Using git-filter-repo:
pip3 install git-filter-repo
git filter-repo --path <path to the file or directory> --invert-paths

Now force-push the repo with git push origin --force --all and tell your collaborators to rebase.
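Several comments report that the file was "still there" after a filter-branch run. One likely reason on the local side is that filter-branch keeps backups under refs/original/ and the reflog still points at the old commits. The sketch below follows the cleanup steps described in the git-filter-branch manual, applied to a throwaway repository; the file name `secret.txt` is hypothetical. It deals only with the local object store; what a hosting service keeps cached is a separate matter.

```shell
#!/bin/sh
# Sandbox demo: rewrite, then purge the old objects from the local repository.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
echo "api_key=12345" > secret.txt
echo "docs" > README.md
git add .
git -c user.name=Demo -c user.email=demo@example.com commit -q -m "initial"
blob=$(git rev-parse HEAD:secret.txt)   # object id of the secret's content

FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force --index-filter \
  'git rm --cached --ignore-unmatch secret.txt' \
  --prune-empty --tag-name-filter cat -- --all

# Right after the rewrite the old content is still stored locally.
if git cat-file -e "$blob" 2>/dev/null; then before=present; else before=gone; fi

# Drop the refs/original backups, expire reflogs, and prune unreachable objects.
git for-each-ref --format='delete %(refname)' refs/original | git update-ref --stdin
git reflog expire --expire=now --all
git gc --quiet --prune=now

if git cat-file -e "$blob" 2>/dev/null; then after=present; else after=gone; fi
echo "secret blob before cleanup: $before, after cleanup: $after"
```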

alper
suhailvs
  • 1
    @alper you need to replace `PATH-TO-YOUR-FILE-WITH-SENSITIVE-DATA` with the file to remove eg: `README.md` if you want to remove it. – suhailvs Sep 04 '21 at 09:14
  • You need to use `-rf` in order to remove folders – alper Sep 04 '21 at 20:20
  • 10
    For `git filter-repo`: I am getting following message : `Aborting: Refusing to destructively overwrite repo history since this does not look like a fresh clone. (expected freshly packed repo) Please operate on a fresh clone instead. If you want to proceed anyway, use --force.`. If I force it I get following: `fatal: 'origin' does not appear to be a git repository fatal: Could not read from remote repository. ` – alper Sep 06 '21 at 10:46
  • `git filter-branch` worked for me! – Federico Peralta Sep 21 '21 at 04:14
  • 3
    `git filter-branch` approach worked for me on mac, while `filter-repo` approach was removing remote origin – Ilya Sheershoff Jan 25 '22 at 09:28
  • 2
    This worked, but I forgot to back up the file first, and now it's gone. :-( – kr37 Apr 30 '22 at 15:43
  • I have the same problem of @IlyaSheershoff, filter-repo remove remote origin, anyone know why? – Richard Aguirre Jan 25 '23 at 18:32
43

I read this GitHub article, which led me to the following command (similar to the accepted answer, but a bit more robust):

git filter-branch --force --index-filter "git rm --cached --ignore-unmatch PATH-TO-YOUR-FILE-WITH-SENSITIVE-DATA" --prune-empty --tag-name-filter cat -- --all
vancy-pants
15
  • First of all, add it to your .gitignore file and don't forget to commit the file :-)

  • You can use this site: http://gitignore.io to generate the .gitignore for you and add the required path to your binary files/folder(s)

  • Once you added the file to .gitignore you can remove the "old" binary file with BFG.


How to remove big files from the repository

You can use git filter-branch or BFG. https://rtyley.github.io/bfg-repo-cleaner/

BFG Repo-Cleaner

an alternative to git-filter-branch.

The BFG is a simpler, faster alternative to git-filter-branch for cleansing bad data out of your Git repository history:

  • Removing Crazy Big Files

  • Removing Passwords, Credentials & other Private data

Examples (from the official site)

In all these examples bfg is an alias for java -jar bfg.jar.

# Delete all files named 'id_rsa' or 'id_dsa' :
bfg --delete-files id_{dsa,rsa}  my-repo.git


CodeWizard
  • Is it a third party cleaner? – alper Sep 03 '21 at 13:39
  • Is it secure to use? – alper Sep 04 '21 at 22:08
  • Indeed, a very "old" tool which is being used by the community for few years. The source is in GitHub so you and the community can browse it. – CodeWizard Sep 05 '21 at 13:59
  • I just find out that GitHub does not remove deleted commits in case when users request them to run garbage collector, (https://stackoverflow.com/questions/34582480/remove-commit-for-good/34594815#34594815). I am just get lost where when we use 3rd party tools like GitHub whatever committed, we will always need to ask them to remove it, which is not cool – alper Sep 05 '21 at 16:56
10

Using the bfg repo-cleaner package is another viable alternative to git-filter-branch. Apparently, it is also faster...

c1au61o_HH
2

Run the following commands one by one in each project to remove a specific file from its history. Back up the project first, because the file will be deleted.

  • git filter-branch --index-filter "git rm --cached --ignore-unmatch ProjectFolderName/src/main/resources/application-prod.properties" HEAD

  • git push origin --force --all

  • git update-ref -d refs/original/refs/heads/master

..........................................................................

  • git filter-branch --index-filter "git rm --cached --ignore-unmatch ProjectFolderName/src/main/resources/application.properties" HEAD

  • git push origin --force --all

  • git update-ref -d refs/original/refs/heads/master

1

Remove file(s)

bfg --delete-files YOUR-FILE-WITH-SENSITIVE-DATA

To replace all text listed in passwords.txt wherever it can be found in your repository's history, run:

bfg --replace-text passwords.txt

After that you need to push your changes to GitHub/GitLab/Bitbucket:

git push --force

More about the BFG tool here

Furthermore, this technique rewrites your repository's history, which changes the SHAs of the commits you alter and of any dependent commits. So merge or close all open PRs first!

Panagiss