
One of my passwords is committed in probably a few files in my Git repo. Is there a way to automatically replace this password with some other string throughout the whole history, so that no trace of it remains? Ideally I could write a simple Bash script that takes the string to find and the string to replace it with, and does all the work itself, something like:

./replaceStringInWholeGitHistory.sh "my_password" "xxxxxxxx"

Edit: this question is not a duplicate of that one, because I am asking about replacing strings without removing whole files.

Karol Selak
  • It can be done. Have you published your repo on a remote server (GitHub, GitLab, other...)? Are there other people who work with it? – Techniv Oct 26 '17 at 09:42
  • Possible duplicate of [Remove sensitive files and their commits from Git history](https://stackoverflow.com/questions/872565/remove-sensitive-files-and-their-commits-from-git-history) – kowsky Oct 26 '17 at 09:44
  • To be strict, this is our company account, a few people have access to it, and we use an internal GitHub repo on our own server. But in general, every person with access to the repo is trusted for now. – Karol Selak Oct 26 '17 at 09:46

3 Answers


git filter-repo --replace-text

The Git 2.25 man page for git-filter-branch already clearly recommends using git filter-repo instead of git filter-branch, so here we go.

Install: https://superuser.com/questions/1563034/how-do-you-install-git-filter-repo/1589985#1589985

python3 -m pip install --user git-filter-repo

and then use:

echo 'my_password==>xxxxxxxx' > replace.txt
git filter-repo --replace-text replace.txt

or equivalent with Bash magic:

git filter-repo --replace-text <(echo 'my_password==>xxxxxxxx')
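For the script interface asked for in the question, the commands above could be wrapped roughly as follows. This is a minimal sketch rather than a tested tool: the function names replace_rule and replace_string_in_whole_history are made up for illustration, and it assumes git filter-repo is installed and that the search string contains no newline, no ==>, and does not start with regex: or glob:.

```shell
#!/bin/bash
# Hypothetical wrapper around `git filter-repo --replace-text`.

# Print one expressions-file line of the form OLD==>NEW.
replace_rule() {
  printf '%s==>%s\n' "$1" "$2"
}

# Rewrite all refs, replacing the literal string $1 with $2.
replace_string_in_whole_history() {
  local rules rc
  rules=$(mktemp) || return 1
  replace_rule "$1" "$2" > "$rules"
  git filter-repo --replace-text "$rules"
  rc=$?
  rm -f "$rules"
  return "$rc"
}

# Example (run inside a fresh clone of the repository):
#   replace_string_in_whole_history "my_password" "xxxxxxxx"
```

Note that git filter-repo refuses to run on a repository that does not look like a fresh clone unless --force is passed, so trying this on a fresh clone first is the safer route.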

Tested with this simple test repository: https://github.com/cirosantilli/test-git-filter-repository and replacement strings:

d1==>asdf
d2==>qwer

The above acts on all branches by default (so invasive!!!). To act only on selected branches, see "git filter-repo: can it be used on a specific branch?", e.g.:

--refs HEAD
--refs refs/heads/master

To act only on a specified commit range, see "How to modify only a range of commits with git filter-repo instead of the entire branch history?", e.g.:

--refs HEAD~2..master
--refs HEAD~2..HEAD
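Put together with --replace-text, a restricted rewrite could look like the sketch below. The helper name replace_in_refs is hypothetical, and replace.txt stands for an expressions file like the one created earlier.

```shell
# Hypothetical helper: apply the expressions file $1 only to the ref or
# commit range given as $2, instead of to every branch.
replace_in_refs() {
  git filter-repo --replace-text "$1" --refs "$2"
}

# Examples:
#   replace_in_refs replace.txt refs/heads/master
#   replace_in_refs replace.txt HEAD~2..HEAD
```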

The --replace-text option is documented at: https://github.com/newren/git-filter-repo/blob/7b3e714b94a6e5b9f478cb981c7f560ef3f36506/Documentation/git-filter-repo.txt#L155

--replace-text <expressions_file>::

A file with expressions that, if found, will be replaced. By default, each expression is treated as literal text, but regex: and glob: prefixes are supported. You can end the line with ==> and some replacement text to choose a replacement choice other than the default of ***REMOVED***.
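To illustrate the quoted documentation, an expressions file can mix a literal with a custom replacement, a literal that gets the default ***REMOVED***, and a regular expression. The strings below are invented examples and write_rules is a hypothetical helper name.

```shell
# Write a sample expressions file to the path given as $1.
#   line 1: literal text with an explicit replacement
#   line 2: literal text, replaced by the default ***REMOVED***
#   line 3: regular expression with an explicit replacement
write_rules() {
  cat > "$1" <<'EOF'
my_password==>xxxxxxxx
old_api_token
regex:secret[0-9]+==>xxxxxxxx
EOF
}

# Example:
#   write_rules replace.txt
#   git filter-repo --replace-text replace.txt
```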

How to replace in a single file: git-filter-repo replace text by expression in a single file

Of course, once you've pushed a password publicly, it is always too late, and you will have to change the password, so I wouldn't even bother with the replace in this case: Remove sensitive files and their commits from Git history

Related: How to substitute text from files in git history?

Tested on git-filter-repo ac039ecc095d.

Ciro Santilli OurBigBook.com

First, find all the files that could contain the password. Suppose the password is abc123 and the branch is master. You may need to exclude those files which have abc123 only as a normal string.

git log -S "abc123" master --name-only --pretty=format: | sort -u
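If the password may also live on other branches, the same search can be widened. The sketch below uses --all (every ref) in place of a single branch name and drops the blank lines that --pretty=format: leaves between commits; files_touching_string is a hypothetical name.

```shell
# List every path in which some commit, reachable from any ref,
# added or removed the string $1.
files_touching_string() {
  git log -S "$1" --all --name-only --pretty=format: | sed '/^$/d' | sort -u
}

# Example:
#   files_touching_string "abc123"
```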

Then replace "abc123" with "******". Suppose one of the files is foo/bar.txt.

git filter-branch --tree-filter "if [ -f foo/bar.txt ]; then sed -i 's/abc123/******/g' foo/bar.txt; fi"

Finally, force push master to the remote repository if it exists.

git push origin -f master:master

I made a simple test and it worked but I'm not sure if it's okay with your case. You need to deal with all the files from all branches. As to the tags, you may have to delete all the old ones, and create new ones.
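Before force pushing, it may be worth checking that the rewrite really removed the string. The sketch below reuses the git log -S search from the first step, limited to branches and tags so that the backup refs git filter-branch keeps under refs/original/ are not searched. The name string_still_in_history is hypothetical.

```shell
# Succeed (exit status 0) if any commit reachable from a branch or tag
# added or removed the string $1.
string_still_in_history() {
  [ -n "$(git log -S "$1" --branches --tags --oneline)" ]
}

# Example:
#   if string_still_in_history "abc123"; then
#     echo "abc123 is still present in history" >&2
#   fi
```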

ElpieKay
  • Hm, okay, it works for the current branch, but if I have more I'll probably need to do that for each of them. – Karol Selak Oct 26 '17 at 10:40
  • I have a problem with branches other than master. When I try `git log -S "abc123" test --name-only --pretty=format: | sort -u` I get the error: `fatal: ambiguous argument 'test': both revision and filename`. Can I avoid it somehow? – Karol Selak Nov 06 '17 at 17:05
  • @KarolSelak the error says you have a ref named `test` and also a file named `test`. It's a naming conflict. If you expect Git to interpret `test` as a ref, then use `git log -S "abc123" test --name-only --pretty=format: -- | sort -u`. If interpreted as a file, then use `git log -S "abc123" --name-only --pretty=format: -- test | sort -u`. If you need both, then `git log -S "abc123" test --name-only --pretty=format: -- test | sort -u`. There are spaces around the `--`. See https://www.git-scm.com/docs/gitcli#_description for more. – ElpieKay Nov 06 '17 at 22:57
  • Thank you very much, I've finally written what I need, but the credit is mostly yours. I hope the final solution will serve others for a long time :) – Karol Selak Nov 07 '17 at 12:47
  • @KarolSelak Glad it helps =). Don't forget to delete and recreate tags that you have pushed. They are still pointing to the old commits that may contain your password. – ElpieKay Nov 07 '17 at 13:08
  • Okay, thanks for the important advice. Fortunately I don't have any tags in my repo, but I'll edit my answer to include it. – Karol Selak Nov 07 '17 at 13:14

First of all, I'd like to thank ElpieKay, who posted the core commands of my solution, which I have only automated.

So, I finally have the script I wanted. I split it into pieces that build on each other and can also be used as independent scripts. It looks like this:

censorStringsInWholeGitHistory.sh:

#!/bin/bash
# arguments are strings to censor

for string in "$@"
do
  echo ""
  echo "================ Censoring string \"$string\": ================"
  ~/replaceStringInWholeGitHistory.sh "$string" "********"
done

usage:

~/censorStringsInWholeGitHistory.sh "my_password1" "my_password2" "some_f_word"

replaceStringInWholeGitHistory.sh:

#!/bin/bash
# $1 - string to find
# $2 - string to replace with

for branch in $(git for-each-ref --format='%(refname:short)' refs/heads/); do
  echo ""
  echo ">>> Replacing strings in branch $branch:"
  echo ""
  ~/replaceStringInBranch.sh "$branch" "$1" "$2"
done

usage:

~/replaceStringInWholeGitHistory.sh "my_password" "********"

replaceStringInBranch.sh:

#!/bin/bash
# $1 - branch
# $2 - string to find
# $3 - string to replace with

git checkout "$1"
for file in $(~/findFilesContainingStringInBranch.sh "$2"); do
  echo "          Filtering file $file:"
  ~/changeStringsInFileInCurrentBranch.sh "$file" "$2" "$3"
done

usage:

~/replaceStringInBranch.sh master "my_password" "********"

findFilesContainingStringInBranch.sh:

#!/bin/bash

# $1 - string to find
# $2 - branch name or nothing (current branch in that case)

# ${2:+"$2"} passes the branch name only when one was given
git log -S "$1" ${2:+"$2"} --name-only --pretty=format: -- | sort -u

usage:

~/findFilesContainingStringInBranch.sh "my_password" master

changeStringsInFileInCurrentBranch.sh:

#!/bin/bash

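# $1 - file name
# $2 - string to find (sed treats it as a regular expression)
# $3 - string to replace with

# The single quotes inside the filter keep spaces and glob characters such as *
# intact. The strings must not contain ', and any / or & in them has to be
# escaped for sed beforehand.
git filter-branch -f --tree-filter "if [ -f '$1' ]; then sed -i 's/$2/$3/g' '$1'; fi"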

usage:

~/changeStringsInFileInCurrentBranch.sh "abc.txt" "my_password" "********"
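Because the find and replace strings are pasted into the sed expression as they are, characters such as /, &, . or * in them change its meaning. A possible pre-processing step is sketched below. The helper names are hypothetical, and the sketch assumes the strings contain no newline.

```shell
# Escape the characters that are special in a sed basic regular expression,
# plus the / delimiter, so that $1 is matched literally.
escape_sed_pattern() {
  printf '%s' "$1" | sed 's|[][\\/.*^$]|\\&|g'
}

# Escape the characters that are special in a sed replacement (\ and &),
# plus the / delimiter.
escape_sed_replacement() {
  printf '%s' "$1" | sed 's|[\\/&]|\\&|g'
}

# Example:
#   ~/changeStringsInFileInCurrentBranch.sh "abc.txt" \
#     "$(escape_sed_pattern 'pass/word.1')" "$(escape_sed_replacement '********')"
```

Quote characters and whitespace in the strings would still need care, since the filter command is parsed again by the shell that git filter-branch starts.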

All those scripts are located in my home folder, which is necessary for this version to work properly. I'm not sure that's the best option, but for now I cannot find a better one. Of course, every script has to be executable, which we can achieve with chmod +x ~/myscript.sh.

My script is probably not optimal, and for big repos it will take a very long time, but it works :)

And, at the very end, we can push our censored repo to any remote with:

git push <remote> -f --all

Edit: important hint from ElpieKay:

Don't forget to delete and recreate tags that you have pushed. They are still pointing to the old commits that may contain your password.

Maybe I'll improve my script in future to do this automatically.
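One possible way to cover tags and all branches in a single pass is to let git filter-branch rewrite them itself: --tag-name-filter cat re-points each tag at its rewritten commit under the same name, and -- --all selects every ref. This is only a sketch under the same assumptions as the scripts above (GNU sed, no / or quote characters in the strings), and replace_in_all_refs is a hypothetical name.

```shell
# $1 - file name
# $2 - string to find
# $3 - string to replace with
replace_in_all_refs() {
  git filter-branch -f \
    --tree-filter "if [ -f '$1' ]; then sed -i 's/$2/$3/g' '$1'; fi" \
    --tag-name-filter cat -- --all
}

# Afterwards the rewritten tags have to be force-pushed as well:
#   git push <remote> -f --all
#   git push <remote> -f --tags
```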

Karol Selak
  • Do these scripts actually work? I couldn't get them to work: sed: -e expression #1, char 7: unterminated `s' command tree filter failed: – E. T. Apr 29 '20 at 21:42
  • Yes, I just checked and it works for me now. However, I use Git v2.17.1, so I'm not sure about newer versions. And I use Ubuntu. – Karol Selak Apr 29 '20 at 22:02
  • Is the problem by any chance that the sed string really should be escaped? I don't see how this could work if it contains spaces, forward slashes, or similar – E. T. Apr 29 '20 at 23:14
  • I don't know, my answer is based on ElpieKay's (https://stackoverflow.com/a/46951323/3668967), so maybe he will be able to help you. – Karol Selak May 01 '20 at 15:27