16

I am using a shared GitHub repository to collaborate on a project. Because I am an idiot, I committed and pushed a script file containing a password which I don't want to share (yes, I can change the password, but I would like to remove it anyway!).

Is there any way to revert the commits from GitHub's history, remove the password locally, and then recommit and push the updated files? I do not want to remove the file completely, and I would rather not lose the commit history on GitHub.

(This question, "How can I completely remove a file from a git repository?", shows how to remove a sensitive file, but not how to edit sensitive data out of a file, so this is not a duplicate.)

doctorer
  • 1
    Does this answer your question? [How to substitute text from files in git history?](https://stackoverflow.com/questions/4110652/how-to-substitute-text-from-files-in-git-history) – Daniel Mann Jan 21 '20 at 23:30
  • *I would rather not lose the commit history on github*: To be clear, you still want the commit history to contain the viewable revision of the file with the password? – Gino Mempin Jan 21 '20 at 23:53
  • No, sorry, I was unclear: I want the rest of the commit history, but with the password removed – doctorer Jan 21 '20 at 23:55

3 Answers

18

I would recommend using the new git filter-repo, which replaces BFG and git filter-branch.

Note: if you get the following error message when running the commands below:

Error: need a version of `git` whose `diff-tree` command has the `--combined-all-paths` option

it means you have to update git.


First: do that on a copy of your local repo (a new clone).

See "Content based filtering":

Once the history has been rewritten as described below, you can (if you are the only one working on that repository) do a git push --force.

If you want to modify file contents, you can do so based on a list of expressions in a file, one per line.
For example, with a file named expressions.txt containing:

p455w0rd
foo==>bar
glob:*666*==>
regex:\bdriver\b==>pilot
literal:MM/DD/YYYY==>YYYY-MM-DD
regex:([0-9]{2})/([0-9]{2})/([0-9]{4})==>\3-\1-\2

then running

git filter-repo --replace-text expressions.txt
# on Windows
git-filter-repo --replace-text expressions.txt
  ^^^

will go through and replace:

  • p455w0rd with ***REMOVED***,
  • foo with bar,
  • any line containing 666 with a blank line,
  • the word driver with pilot (but not if it has letters before or after; e.g. drivers will be unmodified),
  • the exact text MM/DD/YYYY with YYYY-MM-DD and
  • date strings of the form MM/DD/YYYY with ones of the form YYYY-MM-DD.

gaborous adds in the comments:

On Windows, git-filter-repo works as a separate Python module (that you can install as such using pip install), so you need to add a dash in the above command for it to work on Windows:

git-filter-repo --replace-text expressions.txt
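
A minimal end-to-end sketch of the steps above, run against a throwaway repository so that nothing real is touched. The file name `deploy.sh`, the demo identity passed via `-c`, and the temporary paths are illustrative assumptions, not part of the original answer. The `git filter-repo` step is skipped when the tool is not installed, and `--force` is only needed here because the demo repository is not a fresh clone.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
# Helper: run git with a throwaway identity (hypothetical values).
g() { git -c user.name=demo -c user.email=demo@example.invalid "$@"; }

git init -q "$work/repo"
cd "$work/repo"

# Commit a script that contains the password.
printf 'PASSWORD=p455w0rd\necho deploying\n' > deploy.sh
g add deploy.sh
g commit -q -m "Add deploy script"

# Deleting the password in a follow-up commit does not clean the history:
# the pickaxe search still finds the commits that added and removed it.
printf 'PASSWORD=\necho deploying\n' > deploy.sh
g commit -q -am "Remove password"
leaks_before=$(git log --all --oneline -S p455w0rd | wc -l | tr -d ' ')
echo "commits found by 'git log -S' before scrubbing: $leaks_before"

# One expression per line; a bare literal is replaced with ***REMOVED***.
printf 'p455w0rd\n' > "$work/expressions.txt"

leaks_after=skipped
if git filter-repo --version >/dev/null 2>&1; then
    git filter-repo --force --replace-text "$work/expressions.txt"
    leaks_after=$(git log --all --oneline -S p455w0rd | wc -l | tr -d ' ')
fi
echo "commits found by 'git log -S' after scrubbing: $leaks_after"
```

On a real clone, git filter-repo removes the `origin` remote after rewriting, so it would have to be re-added (`git remote add origin <url>`, with your own URL) before the `git push --force` mentioned above.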
VonC
  • 4
    Note to self: That was my **23000th answer** on Stack Overflow (in 136 months), 5 months after the [22000th answer](https://stackoverflow.com/a/57387243/6309). Before that: [21000th answer](https://stackoverflow.com/a/54856158/6309), [20000th answer](https://stackoverflow.com/a/51915541/6309), [19000th answer](https://stackoverflow.com/a/49421565/6309), [18000th answer](https://stackoverflow.com/a/46860745/6309), [17000th answer](https://stackoverflow.com/a/43703956/6309), [16000th answer](http://stackoverflow.com/a/40698777/6309), [15000th answer](http://stackoverflow.com/a/37539529/6309) ... – VonC Jan 22 '20 at 12:57
  • 1
    Works like a charm! On Windows, `git-filter-repo` works as a separate Python module (that you can install as such using `pip install`), so you need to add a dash in the above command for it to work on Windows: `git-filter-repo --replace-text expressions.txt` – gaborous Aug 18 '22 at 22:32
  • 1
    @gaborous Thank you for the feedback. I have included your comment in the answer for more visibility. – VonC Aug 19 '22 at 05:56
  • I appreciate your follow-up on this old question, thank you :-) Yes feel free to add my comment in your answer! Cheers! – gaborous Aug 19 '22 at 21:06
1

If your content has already been pushed to GitHub, then after scrubbing the repository with git filter-repo or bfg and force-pushing the cleaned-up repository, reach out to GitHub Support. They will then make sure all references to the commit and its files are deleted from issue references, pull requests, and the cached data GitHub keeps. Only then will the password really be gone from your repositories.

If anyone forked your repository and synced in the sensitive commit, then there is no way to force GitHub to clean up their repositories too. You'll need to ask each owner of the forks to go through the same process.

Consider your password burned. Since your password is out there and since it will take some time to be fully removed, there is ample time for a bad actor to scrape your current repo state and store the password for later use. Always reset the password. Do not fall for the trap of thinking you may still be safe.

Make sure any other contributors on your project clone a fresh copy or rebase their local changes on the fixed repository. Removing data from history will cause the commit IDs of all subsequent commits to change.
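
A small sketch of what resynchronising looks like for a collaborator who has no unpushed work, using only local repositories. Everything here is an assumption made for illustration: a bare repository stands in for GitHub, `alice` and `bob` are hypothetical clones, and `git commit --amend` stands in for the real history rewrite. A collaborator with local commits would rebase them onto the new history instead of resetting; a plain `git pull` would merge the old commits (and the password) back in.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
# Helper: run git with a throwaway identity (hypothetical values).
g() { git -c user.name=demo -c user.email=demo@example.invalid "$@"; }

# Alice creates the project; a bare clone plays the role of the shared remote.
git init -q "$work/alice"
printf 'PASSWORD=p455w0rd\n' > "$work/alice/config.sh"
g -C "$work/alice" add config.sh
g -C "$work/alice" commit -q -m "Add config"
git clone -q --bare "$work/alice" "$work/remote.git"
git -C "$work/alice" remote add origin "$work/remote.git"

# Bob clones while the password is still in the history.
git clone -q "$work/remote.git" "$work/bob"
old_head=$(git -C "$work/bob" rev-parse HEAD)

# Alice rewrites the commit (stand-in for the scrub) and force-pushes.
printf 'PASSWORD=\n' > "$work/alice/config.sh"
g -C "$work/alice" commit -q -a --amend -m "Add config"
git -C "$work/alice" push -q --force origin HEAD
alice_head=$(git -C "$work/alice" rev-parse HEAD)

# Bob discards his copy of the old history and adopts the rewritten branch.
git -C "$work/bob" fetch -q origin
git -C "$work/bob" reset -q --hard '@{upstream}'
bob_head=$(git -C "$work/bob" rev-parse HEAD)

echo "old:   $old_head"
echo "alice: $alice_head"
echo "bob:   $bob_head"
```

The commit ID Bob ends up on matches Alice's rewritten commit and differs from the one he originally cloned, which is why work based on the old IDs has to be moved over deliberately.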

jessehouwing
0

Use BFG: https://rtyley.github.io/bfg-repo-cleaner/

To remove files:

bfg --delete-files <file to remove> my-repo.git

You can also use this tool to remove passwords and any other sensitive data.

Prepare a replacement file with the content you wish to replace and use BFG to clean it out.

bfg --replace-text passwords.txt my-repo.git

# Example of the passwords.txt file:
string1                   # replace with '***REMOVED***' (default text)
string2==>replacementText # replace with 'replacementText' instead
string3==>                # replace with the empty string
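
A hedged sketch of how the surrounding steps might fit together, following the sequence described in the BFG documentation (bare mirror clone, BFG run, then expiring and pruning the old objects). The local `upstream` repository, the file `deploy.sh`, and the assumption that BFG is installed as a `bfg` command are all illustrative; the BFG step is skipped if no such command is found. The password is removed from the latest commit first, because BFG by default leaves the contents of the latest commit untouched.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
# Helper: run git with a throwaway identity (hypothetical values).
g() { git -c user.name=demo -c user.email=demo@example.invalid "$@"; }

# Stand-in for the GitHub repository: the password was committed once and
# has already been removed again in the latest commit.
git init -q "$work/upstream"
printf 'PASSWORD=string1\n' > "$work/upstream/deploy.sh"
g -C "$work/upstream" add deploy.sh
g -C "$work/upstream" commit -q -m "Add deploy script"
printf 'PASSWORD=\n' > "$work/upstream/deploy.sh"
g -C "$work/upstream" commit -q -am "Remove password"

# BFG operates on a bare mirror clone of the repository.
git clone -q --mirror "$work/upstream" "$work/my-repo.git"
is_bare=$(git -C "$work/my-repo.git" rev-parse --is-bare-repository)

# One literal per line; the default replacement is ***REMOVED***.
printf 'string1\n' > "$work/passwords.txt"

bfg_ran=no
if command -v bfg >/dev/null 2>&1; then
    bfg --replace-text "$work/passwords.txt" "$work/my-repo.git"
    # Expire and prune the old objects that still contain the password.
    git -C "$work/my-repo.git" reflog expire --expire=now --all
    git -C "$work/my-repo.git" gc -q --prune=now --aggressive
    bfg_ran=yes
fi
echo "mirror is bare: $is_bare, bfg ran: $bfg_ran"
```

Against a real remote, the final step would be a push from inside the mirror clone to overwrite the published history.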
CodeWizard
  • Hmmm... this will clean my git repo, but if I then push to GitHub, won't it just add another (clean) commit and leave the password in the GitHub history? – doctorer Jan 21 '20 at 23:40
  • Using BFG will remove it from the **entire** history but you will have to force push to overwrite current content – CodeWizard Jan 21 '20 at 23:42
  • Have installed bfg but `bfg --replace-text passwords.txt` gives me a syntax error `bfg --replace-text passwords.txt ^ SyntaxError: invalid syntax` – doctorer Jan 22 '20 at 00:01
  • `SyntaxError` looks like a Python error. I suspect you ran the command in a Python command line. Run it in an OS console (the OS command interpreter). – phd Jan 22 '20 at 00:52
  • 1
    `git-filter-repo` provides a replacement based on their tool but mimicking BFG's syntax, but with some fixed bugs, called [bfg-ish](https://github.com/newren/git-filter-repo/blob/main/contrib/filter-repo-demos/bfg-ish). – gaborous Aug 18 '22 at 22:33