I have a list of Bitbucket repositories on a server:

[user@lonapdbitbucket1 repositories]$ ls
1039 1044 1059 2165 2656 3958 3958 9284 9274 8274 7264 7263 8274

If I cd into one of these repositories and run git grep to search for Ansible encryption strings, it works fine and git grep finds an Ansible encryption string:

[user@lonapdbitbucket1 repositories]$ cd 1044
[user@lonapdbitbucket1 repositories]$ git grep -P '\$ANSIBLE_VAULT;[0-9]\.[0-9];AES256' $(git rev-list --all)

To do this across multiple repos, I thought to convert it into a bash script:

# secret_scan.sh
repos_root=/var/lib/docker/volumes/bitbucket/_data/shared/data/repositories
git_grep_cmd=git grep -P '\$ANSIBLE_VAULT;[0-9]\.[0-9];AES256' $(git rev-list --all)
for dir in ./*
do
    # below line is just to clean up the directory string
    repo_dir="${dir#./}"
    cd "${repos_root}${repo_dir}"; \
    eval "git_grep_cmd"
done

Unfortunately, this does not work:

[user@lonapdbitbucket1 repositories]$ ./secret_scan.sh
fatal: not a git repository (or any parent up to mount point /var/lib)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
fatal: this operation must be run in a work tree
fatal: this operation must be run in a work tree
fatal: this operation must be run in a work tree
fatal: this operation must be run in a work tree
fatal: this operation must be run in a work tree
fatal: this operation must be run in a work tree
fatal: this operation must be run in a work tree
fatal: this operation must be run in a work tree
fatal: this operation must be run in a work tree
fatal: this operation must be run in a work tree
fatal: this operation must be run in a work tree
[user@lonapdbitbucket1 repositories]$ _

Would anyone be able to suggest a solution here? Essentially, I want to cd into multiple repositories and run git grep in each, getting the same results as if I were doing it on the command line.

cokeburger
  • Put a valid [shebang](https://en.wikipedia.org/wiki/Shebang_(Unix)) and paste your script at https://shellcheck.net for validation/recommendation. – Jetchisel Jan 12 '23 at 18:12
  • Don't store commands in variables. Variables are for data, not executable code. If you need to store executable code, use a function (or maybe an array), but in this case I'd just skip storing it. See [BashFAQ #50: "I'm trying to put a command in a variable, but the complex cases always fail!"](http://mywiki.wooledge.org/BashFAQ/050) BTW, the immediate problem is that your syntax for storing the command is all wrong, you need another layer of quoting/escaping, but fixing that is messy. Just don't. – Gordon Davisson Jan 12 '23 at 18:23

1 Answer

I'm not sure what you are trying to achieve with eval and repo_dir. This should be as simple as:

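repos_root=/var/lib/docker/volumes/bitbucket/_data/shared/data/repositories
# iterate over the repositories themselves, not over the current directory
for dir in "$repos_root"/*/
do
    # skip anything we cannot enter
    cd "$dir" || continue
    # rev-list now runs inside the repository, so the revisions belong to it
    git grep -P '\$ANSIBLE_VAULT;[0-9]\.[0-9];AES256' $(git rev-list --all)
done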

Since your repos_root is absolute, you don't need to take care of returning to the original directory.
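
A variation on the loop above, as a minimal sketch: it follows the comment's advice to use a function instead of a command stored in a variable, and it uses `git -C` so that no `cd` is needed at all. The names `scan_repo` and `scan_all` are hypothetical helpers, not something from the question, and the guard against an empty revision list assumes the "must be run in a work tree" errors come from `git grep` being called without any revisions.

```shell
#!/usr/bin/env bash
# Sketch: search every revision of every repository under a root directory.
# scan_repo and scan_all are hypothetical helper names.

pattern='\$ANSIBLE_VAULT;[0-9]\.[0-9];AES256'

scan_repo() {
    local repo=$1 revs
    # git -C runs the command as if it had been started in "$repo"
    revs=$(git -C "$repo" rev-list --all 2>/dev/null) || return 0
    # With no revisions, git grep would search the work tree instead,
    # which fails in a repository that has none
    [ -n "$revs" ] || return 0
    # $revs is deliberately unquoted: one argument per commit hash
    git -C "$repo" grep -P "$pattern" $revs
}

scan_all() {
    local root=$1 repo
    for repo in "$root"/*/; do
        # skip the unexpanded glob if the root has no subdirectories
        [ -d "$repo" ] || continue
        echo "== $repo"
        scan_repo "$repo"
    done
}
```

It would be called as `scan_all "$repos_root"`. Directories that are not Git repositories are skipped silently, because `git rev-list` fails there and the function returns early.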

But I'm sceptical that substituting the output of git rev-list --all into the command line is a good idea: it expands to one argument per commit, so the list will be huge. Are you trying to find the string in ALL commits? To search the full history for a string, see "Search all of Git history for a string" and "How to grep Git commit diffs or contents for a certain word".
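
One way to keep the command line from growing with the number of commits, sketched under assumptions: the commit hashes are piped through `xargs`, which runs `git grep` once per batch of revisions. `scan_repo_batched` is a hypothetical name, and `-r` is the GNU xargs option that skips the command when the input is empty.

```shell
#!/usr/bin/env bash
# Sketch: feed commit hashes to git grep in batches instead of building
# one very long command line. scan_repo_batched is a hypothetical name.

scan_repo_batched() {
    local repo=$1 pattern=$2
    # xargs splits the revision list into chunks that fit the system's
    # argument-length limit; -r (GNU xargs) avoids running git grep
    # with no revisions at all
    git -C "$repo" rev-list --all |
        xargs -r git -C "$repo" grep -P "$pattern"
}
```

For example, `scan_repo_batched 1044 '\$ANSIBLE_VAULT;[0-9]\.[0-9];AES256'`. The exit status is that of `xargs`, so it is not a reliable "found or not found" signal; check the output instead.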

knittl