1105

I have a code base which I want to push to GitHub as open source. In this Git-controlled source tree, I have certain configuration files which contain passwords. I made sure not to track these files and I also added them to the .gitignore file. However, I want to be absolutely positive that no sensitive information is going to be pushed, in case something slipped in between commits. I doubt I was careless enough to do this, but I want to be sure.

Is there a way to "grep" all of Git? I know that sounds weird, but by "all" I mean every version of every file that ever existed. I guess if there is a command that dumps the diff file for every commit, that might work?

Peter Mortensen
Jorge Israel Peña
  • It's limited in that it only searches a single branch (master?), but it's pretty close to what you want: https://github.com/divinity76/SearchGithubHistory.js/blob/master/SearchGithubHistory.js – hanshenrik Jan 23 '15 at 13:12
  • 1
    Notwithstanding the 'Correct Answers', your requirement is to check that certain information is not committed publicly - the 'git' answer is only relevant since you are committing the whole history. Of course if you only commit the current revision, without history (use eg. "git archive"), then a simple 'grep' will suffice. – MikeW Jul 20 '16 at 12:28
  • 6
    not a duplicate. the other question is about just the logs, this one is about *all* of a git history. those are different. – worc Dec 18 '18 at 21:46

3 Answers

1723

Git can search diffs with the -S option (called the pickaxe in the docs):

git log -S password

This will find any commit that added or removed the string password, that is, any commit that changed the number of times it occurs. Here are a few options:

  • -p (--patch): shows the diff of each matching commit. Adding a path (-p -- file) limits both the search and the patch output to that file.
  • -G: looks for differences whose added or removed line matches the given regexp, as opposed to -S, which "looks for differences that introduce or remove an instance of string".
  • --all: searches over all branches and tags; alternatively, use --branches[=<pattern>] or --tags[=<pattern>]
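The options above can be tried out on a throwaway repository. The following is a minimal sketch, not part of the original answer: the temporary repository, the file name config.ini and the values hunter2/hunter3 are all made up for illustration. It shows the practical difference between -S and -G when a line containing the search term is changed rather than added or removed.

```shell
#!/bin/sh
# Sketch under stated assumptions: a scratch repo with three commits.
set -eu

repo=$(mktemp -d)                 # hypothetical throwaway repository
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# Commit 1: the string "password" appears for the first time.
echo 'password=hunter2' > config.ini
git add config.ini
git commit -q -m "add config"

# Commit 2: the line changes, but "password" still occurs exactly once.
echo 'password=hunter3' > config.ini
git commit -q -am "rotate password"

# Commit 3: the file (and with it the string) is removed.
git rm -q config.ini
git commit -q -m "remove config"

# -S only reports commits that change the number of occurrences.
s_hits=$(git log --all --format=%h -S'password' -- config.ini | wc -l | tr -d ' ')
# -G reports every commit whose diff has an added or removed line matching the regexp.
g_hits=$(git log --all --format=%h -G'password' -- config.ini | wc -l | tr -d ' ')

echo "-S found $s_hits commits, -G found $g_hits commits"
```

In this setup -S should skip the "rotate password" commit while -G reports it, so when checking for leaked secrets -G is probably the safer choice.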
djvg
Nathan Kinsinger
  • 3
    If something does wind up committed, is there an easy way to remove it? Let's assume in this scenario there's a config file that I want to keep, but one line contains a password, which I want to remove from all of my git history. Any simple way to do that without rewriting every commit? – Matt D Jan 28 '13 at 01:12
  • 2
    @MattD Yes, `git rebase -i ` will do the trick. Relevant question: http://stackoverflow.com/questions/4963261/can-i-rebase-old-commits – Erik B Jan 30 '13 at 11:52
  • 3
    hi, `git log -Gpassword --all`, how to add condition to only search for some file(giving a regex to filter filename+filepath) – atian25 Aug 22 '13 at 01:02
  • 1
    @MattD "without rewriting every commit" - No. You'll have to re-write every commit after the one introducing the password. This tool is the easiest way to clean the repo: http://rtyley.github.io/bfg-repo-cleaner/ – adamnfish Mar 13 '14 at 11:43
  • 29
    In this particular case I'd also throw in a `-i` to make the search case insensitive. – dain Oct 15 '15 at 10:58
  • Note the lack of space between the `-S` and `password`. I had seen other advice that included a space and including one would result in a `fatal: ambiguous argument`. – Paul Calabro Nov 04 '15 at 17:59
  • 12
    Just an FYI, the above command didn't really work for me. I did the following: `git log -p -S ` I stole this info from [this informative article about git pickaxe.](http://www.philandstuff.com/2014/02/09/git-pickaxe.html) – AlbertEngelB Feb 18 '16 at 20:56
  • 2
    Rolled back the edit by Geoffrey Hale The `-S` option does search diffs. Adding space after `-S` changes the meaning of the argument from search term to "revision or path". – Andomar Jul 06 '16 at 11:03
  • 25
    I don't know if this is new, but the linked docs says that `-S` looks for "differences that change **the number of occurrences** of the specified string" (emphasis added.) So if a commit adds the term you're seeking but also removes it from elsewhere, `-S` will not find it. `-G`, OTOH, doesn't do this. – shawkinaw Mar 17 '17 at 16:42
  • 9
    Thanks! Because this is such a useful reference, I'd add that `-- path/filename` will restrict the search to a file. – ptim Mar 24 '17 at 04:42
  • 1
    you might also want to add `--patch` so that you can actually see the code changes: `git log -Sword --patch` – ccpizza Feb 06 '20 at 13:27
  • This will not find passwords that were committed as part of the username or email address. – Thomas Weller Mar 21 '22 at 12:25
135
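# Search the tree of every commit reachable from any ref.
git rev-list --all | (
    while read -r revision; do
        # -F treats 'password' as a fixed string rather than a regexp
        git grep -F 'password' "$revision"
    done
)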
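The loop prints one line per match per revision, which gets noisy on a long history. A possible variation, sketched here on a made-up scratch repository (the file names secrets.cfg and README and the value hunter2 are assumptions for the demo), reduces the output to the unique paths that ever contained the string:

```shell
#!/bin/sh
# Sketch under stated assumptions: a scratch repo whose secret is gone from HEAD.
set -eu

repo=$(mktemp -d)                 # hypothetical throwaway repository
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo 'password=hunter2' > secrets.cfg
echo 'hello' > README
git add .
git commit -q -m "initial"

git rm -q secrets.cfg
git commit -q -m "drop secrets"

paths=$(
    git rev-list --all | while read -r revision; do
        # -l prints "<revision>:<path>" once per matching file;
        # "|| true" keeps going when a revision has no match.
        git grep -l -F 'password' "$revision" || true
    done | cut -d: -f2- | sort -u   # strip the revision prefix, drop duplicates
)

echo "$paths"
```

Even though secrets.cfg no longer exists in the working tree, it still shows up because the first commit is part of the history that would be pushed.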
cdhowie
  • 14
    +1: I would have done "for revision in \`git rev-list --all\`; do git grep… done", but your approach is more reactive, as it greps while the revisions are being found. – Eric O. Lebigot Dec 17 '10 at 09:01
  • 2
    Is it possible to use this on a remote repository (like github)? – studgeek Mar 16 '11 at 00:32
  • 2
    @reesd: Only if you clone it, of course. – cdhowie Mar 22 '11 at 06:01
  • In order to avoid seeing matches from `vendor/cache/` and `public/assets/`, change the `grep` line in this answer to: `git grep -F 'password' $revision | grep -v ':vendor/cache/' | grep -v ':public/assets/'` – user664833 Jan 20 '12 at 18:47
  • You can get file names only (without commit hash) Also sorted and without duplicates. Check my answer for this. Thanks to this answer's OP from whom I took inspiration. [Here is my answer](https://stackoverflow.com/a/69714869/10830091) – om-ha Oct 25 '21 at 22:09
  • This will not find passwords that were committed as part of the username or email address. – Thomas Weller Mar 21 '22 at 12:26
68

Try the following commands to search for a string in all previously tracked files:

git log --patch | less +/search_string

or

git rev-list --all | GIT_PAGER=cat xargs git grep 'search_string'

The second command needs to be run from the top of the directory tree you want to search.
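For a yes/no answer before pushing, the second command can be wrapped in a small function. This is only a sketch: history_contains is a hypothetical helper name, and the scratch repository, settings.cfg and hunter2 are invented for the demo. It checks for output instead of relying on the exit status, since xargs reports failure whenever any batch of revisions has no match.

```shell
#!/bin/sh
# Sketch under stated assumptions: history_contains is a made-up helper.
set -eu

# Succeeds if the fixed string $1 occurs in any file of any reachable commit.
# Assumes the repository has at least one commit.
history_contains() {
    # "|| true" absorbs the non-zero status from batches without a match.
    matches=$(git rev-list --all | xargs git grep -l -F -e "$1" || true)
    [ -n "$matches" ]
}

repo=$(mktemp -d)                 # hypothetical throwaway repository
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo 'api_key=hunter2' > settings.cfg
echo 'hello' > README
git add .
git commit -q -m "initial"

git rm -q settings.cfg
git commit -q -m "remove settings"

if history_contains 'hunter2'; then leaked=yes; else leaked=no; fi
if history_contains 'no-such-secret'; then clean=no; else clean=yes; fi

echo "leaked=$leaked clean=$clean"
```

In a real repository the function would be called from the repository root with the actual secret, and a successful return would mean the history needs cleaning before it is published.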

kenorb