I'm trying to remove sensitive data like passwords from my Git history. Instead of deleting whole files I just want to substitute the passwords with removedSensitiveInfo
. This is what I came up with after browsing through numerous StackOverflow topics and other sites.
git filter-branch --tree-filter "find . -type f -exec sed -Ei '' -e 's/(aSecretPassword1|aSecretPassword2|aSecretPassword3)/removedSensitiveInfo/g' {} \;"
When I run this command it seems to be rewriting the history (it shows the commits it's rewriting and takes a few minutes). However, when I check to see if all sensitive data has indeed been removed it turns out it's still there.
For reference this is how I do the check
git grep aSecretPassword1 $(git rev-list --all)
Which shows me all the hundreds of commits that match the search query. Nothing has been substituted.
Any idea what's going on here?
I double checked the regular expression I'm using which seems to be correct. I'm not sure what else to check for or how to properly debug this as my Git knowledge quite rudimentary. For example I don't know how to test whether 1) my regular expression isn't matching anything, 2) sed isn't being run on all files, 3) the file changes are not being saved, or 4) something else.
Any help is very much appreciated.
P.S. I'm aware of several StackOverflow threads about this topic. However, I couldn't find one that is about substituting words (rather than deleting files) in all (ASCII) files (rather than specifying a specific file or file type). Not sure whether that should make a difference, but all suggested solutions haven't worked for me.