At some point in the past, sensitive info was uploaded to a git repo. We removed this in the following commit and changed the info, but we'd still like to remove the original info from the git history, while preserving the commits made since then if possible.

I have been experimenting with the BFG Repo-Cleaner and discovered that its help output lists a "-p, --protect-blobs-from <refs>" option that is not mentioned on the site. At first glance this seems to do exactly what we want, but the actual behaviour is a bit bizarre: the file is removed from all commits older than the specified commit, as expected, but it is also removed from all commits newer than the specified commit. That is, on the specified commit the file shows up as a new file with all the lines it had at that point, but in the next commit it shows as deleted. This means that the file is deleted from HEAD.

Is there a way, using BFG or another method, to delete all history of a file from a certain commit while preserving everything since?

Edit for clarity: we have already removed all cases of sensitive info from our repo manually. In some cases that involved deleting whole files; in others we just changed a few lines. But the original files (with now-outdated passwords) can still be seen in the commit history, so we're looking to remove this history, preferably without losing the record of changes we've made since then (in some cases the passwords were removed many months ago and there have been dozens of commits since). I know about the BFG -D option that will delete the entire history, but if possible we'd like to keep the commits that have been made since we removed the passwords.
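
To illustrate what "can still be seen in the commit history" means here, this is a minimal sketch in a throwaway repository. The file name config.txt and both password values are made up for illustration and are not taken from our repo.

```shell
#!/bin/sh
set -e
# Throwaway repository; config.txt and the password strings are hypothetical.
demo=$(mktemp -d)
cd "$demo"
git init -q .
# Helper so the commits work even without a global git identity.
commit() { git -c user.name=demo -c user.email=demo@example.com commit -q -m "$1"; }

echo "password=hunter2" > config.txt
git add config.txt && commit "add config"

# "Fix" the leak the way we did: change the value in a later commit.
echo "password=CHANGED" > config.txt
git add config.txt && commit "rotate password"

# The old value is gone from the working tree...
WORKTREE_HITS=$(grep -c hunter2 config.txt || true)
# ...but a pickaxe search still finds the commits that added or removed it.
HISTORY_HITS=$(git log --all --oneline -S hunter2 -- config.txt | wc -l | tr -d ' ')

echo "working tree matches: $WORKTREE_HITS"
echo "commits touching the old password: $HISTORY_HITS"
```

Both the commit that introduced the old value and the commit that removed it are reported, which is why changing the file in a later commit is not enough on its own.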

Thanks

  • I don't know about the BFG, but it is possible (but hard) to do with filter-branch. Note that the result is an *all new repository*. You must throw out all copies of the old repository and switch everyone to the new repository. This is usually very painful, enough so that most people don't do it, regardless of how easy or hard it might be to achieve. – torek Jul 05 '21 at 14:44
  • You are not saying whether this sensitive data was part of a larger file or whether you want to delete the whole file. If it is the whole file, then using BFG's -D option should suffice. However, if it is part of a file, then the -rt option, where you give it a list of data to replace with ***REMOVED*** in the history, is generally best, although I've not tried it with multiline text. The tool assumes you've just fixed the issue, though, so -D will only try to preserve the latest version if it comes to it. – johnfo Jul 05 '21 at 15:04
  • I'm not sure if you are attached to the history of all other files, but you could just pick the files you need (without the sensitive info), create a new repo and then delete the old one. – Leonardo Alves Machado Jul 05 '21 at 15:10
  • Maybe I wasn't as clear as I could have been in my original question: we have already removed all cases of sensitive info from our repo manually. In some cases that involved deleting whole files; in others we just changed a few lines. But the original files (with now-outdated passwords) can still be seen in the commit history, so we're looking to remove this history, preferably without losing the record of changes we've made since then. I know about the -D option that will delete the entire history, but if possible we'd like to keep the commits that have been made since we removed the passwords. – Bipolarbear54 Jul 06 '21 at 08:52
  • Does this answer your question? [Remove sensitive files and their commits from Git history](https://stackoverflow.com/questions/872565/remove-sensitive-files-and-their-commits-from-git-history) – Ron van der Heijden Sep 08 '22 at 10:22

1 Answer

From the command-line help (bfg --help):

--protect-blobs-from <refs>

where <refs> is a comma-separated list of refs: ref1,ref2,ref3,...,refn.
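
As a rough sketch of how such a comma-separated list might be passed alongside a deletion, assuming the BFG jar is saved as bfg.jar and the repository was mirror-cloned to my-repo.git. Those two names, the file name passwords.txt and the refs master and release are all hypothetical placeholders, not values from the question. The script only prints the command unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: bfg.jar, my-repo.git, passwords.txt and the ref names below
# are hypothetical placeholders chosen for illustration.
BFG_JAR="${BFG_JAR:-bfg.jar}"
REPO="${REPO:-my-repo.git}"
PROTECTED_REFS="master,release"   # comma-separated, no spaces

# Compose the command: --delete-files takes a file-name glob, and
# --protect-blobs-from takes the comma-separated refs.
BFG_CMD="java -jar $BFG_JAR --delete-files passwords.txt --protect-blobs-from $PROTECTED_REFS $REPO"

# Dry run by default; set RUN=1 to actually execute the command.
if [ "${RUN:-0}" = "1" ]; then
    $BFG_CMD
else
    echo "$BFG_CMD"
fi
```

After a real run, the BFG documentation suggests expiring the reflog and garbage-collecting inside the repository (git reflog expire --expire=now --all && git gc --prune=now --aggressive) so the removed blobs are actually dropped.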

Jeremy Caney