I have two submodules in a main repository that are tightly coupled with the code in the main repository. So much so that when performing a git grep, I'd like the main repo to be grepped along with those two specific submodules. I can't do a full git submodule foreach "git grep ... || true", because I have other submodules that are very large, and grepping through those submodules can take up to 15-20 seconds for each of them.

So I need a more specific solution than git submodule foreach, one that lets me name the submodules to grep through in addition to grepping my main repo.

My two submodules are foo and submodules/bar.

Here's my current solution. This allows me to say git gpx -i something.*interesting, etc. I'm still new at creating git aliases, so I'm hoping this solution can be improved.

[alias]
    inr = "!f() { cd $1 && $2 ;}; f"
    gprs = "!f() { for r in $1; do git inr $r \"git grep $2\" | perl -pe \"s|^|$r/|\"; done ;}; f"
    gpx = "!f() { git grep $*; git gprs 'foo submodules/bar' \"$*\" ;}; f"

EDIT: One problem I discovered is that I lose quotes around the regex, so I cannot call something like this: git gpx -i "foo bar", because it gets translated to this: git grep -i foo bar. I can't think of a clean way to avoid this.

Clayton Stanley

1 Answer

REDO: I had hacked together a bash script to do the paging (see the bottom of this answer), but then noticed that your example already prefixes each match with the search path. I modified your alias a little, and this now works for me:

[alias]
    gpx = "!f() { list=$1; shift; for r in . $list; do ( cd \"$r\" && git grep \"$@\" | perl -pe \"s|^|$r/|\"; ); done ;}; f"

If you use a subshell then you can fold in the "inr" functionality "( cd $1; cmd; ... )" (since it isolates the chdir), and if you add a "." then it will search the supermodule as well. I tested it out with something like

git --no-pager gpx 'foo submodules/bar' --color=always -nIE '\w+\(\w*\)' | less -R
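The subshell-plus-prefix pattern can be tried outside a repository. The following is only a sketch: the scratch directory layout and file names are made up for illustration, plain grep stands in for git grep, and sed stands in for the perl one-liner.

```shell
#!/bin/sh
# Hypothetical scratch layout standing in for a superproject ("." ) and one
# submodule ("foo"); none of these names come from a real repository.
top=$(mktemp -d)
mkdir -p "$top/foo"
echo 'needle' > "$top/main.txt"
echo 'needle' > "$top/foo/sub.txt"

cd "$top" || exit 1

# Each iteration runs in its own subshell, so the cd never affects the loop's
# working directory. sed prefixes every result with the directory searched.
results=$(for r in . foo; do
    ( cd "$r" && grep -l needle *.txt | sed "s|^|$r/|" )
done)
echo "$results"

# Record where we ended up: still the top-level scratch directory.
where=$(pwd)
echo "still in: $where"

# Clean up the scratch directory.
cd / && rm -rf "$top"
```

Because every cd happens inside parentheses, relative paths such as foo keep resolving from the top level on each pass through the loop.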

It seems to work well with escaping (was going to use git-rev-parse --sq-quote but seems like git handles that for you already). This seems more elegant than the script stuff listed below, and maybe I could use your style of aliases with the function prefix and replace / simplify a lot of that functionality. That being said, thanks for showing that!
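The lost-quotes problem from the question comes down to how the arguments are forwarded inside the function. Here is a minimal sketch, using hypothetical helper functions that are not part of either alias, that counts how many arguments reach the inner command:

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.
count_args() { echo "$#"; }

# Unquoted $* re-splits every argument on whitespace.
with_star() { count_args $*; }

# Quoted "$@" passes each original argument through as a single word.
with_at() { count_args "$@"; }

with_star -i "foo bar"   # prints 3: -i, foo, bar
with_at -i "foo bar"     # prints 2: -i, "foo bar"
```

Inside a git config alias the quotes have to be escaped, so the quoted form is written as \"$@\".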

I tried to make another alias that uses gpx for the pager stuff, but that got a little hairy with escaping, so I just made a separate alias instead:

[alias]
    gpxp = "!f() { list=$1; shift; for r in . $list; do ( cd \"$r\" && git --no-pager grep --color=always \"$@\" | perl -pe \"s|^|$r/|\"; ); done | less -R ;}; f"

Then it becomes

git gpxp 'foo submodules/bar' -nIE '\w+\(\w*\)'

NOTE: If you get errors, those will show up in the grep text results.

ORIGINAL:

I had run into similar problems with escaping and using git aliases.

I've been writing up a small extension to git-submodule which allows you to do constrained iteration. The help for the feature is in git-submodule-ext foreach [-c | --constrain] (help, implementation). The install instructions are here: README

If you wish to constrain iteration to foo and something/bar, in your supermodule you can do

git config scm.focusGroup 'foo something/bar'

Then to do your greppage

git --no-pager submodule-ext foreach -t -r -c -k git grep 'expression'

Or if you install with aliases,

git --no-pager tsfer -c -k git grep 'expression'

The git --no-pager option prevents each submodule from launching $GIT_PAGER after its search. I added a --keep-going option to the script so that iteration does not stop when grep finds nothing (which yields a non-zero exit status). The alternative is the approach from the submodule docs, git tsfer -c -k 'git grep "expression" || :', which works equivalently.
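The effect of the `|| :` guard can be seen with plain grep, which, like git grep, exits with status 1 when nothing matches. This is only a sketch; the search helper and its sample input are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: searches two fixed lines for the given pattern.
search() { printf 'alpha\nbeta\n' | grep -q "$1"; }

# Without a guard, a search that finds nothing reports failure.
status_plain=0
search gamma || status_plain=$?

# With "|| :" the failure is swallowed, because ":" always succeeds.
search gamma || :
status_guarded=$?

echo "plain=$status_plain guarded=$status_guarded"   # prints plain=1 guarded=0
```

An iteration that stops on the first non-zero status would halt at the unguarded form and carry on past the guarded one.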

If your expression is complex, something like looking for function calls '\w+\(\w*\)', you will need to enclose the entire expression in double quotes:

git --no-pager tsferp -c -k "git grep -E '\w+\(\w*\)'"

If escaping is a huge issue and you're using bash, you can export a function to use in the iteration (the extension script I modified uses bash rather than Git's standard /bin/sh):

greppage() {
    git grep -E 'some(really)?complex.*\(expression\)'
}
export -f greppage
git --no-pager tsfer -c -k greppage

Hope that helps.

NOTE: A current drawback is that it can be hard to tell which submodule a match came from. One fix would be to prepend the $name of the submodule, but I looked through git grep's options and could not find anything like that. I tinkered with the command a little and came up with this, uhh, 'compact' command:

git --no-pager sube -q foreach -t -r -c -k "echo [ \$name ]'\n'; git grep --color=always -E '\w+\(\w*\)'; echo '\n\n'" | less -R

This makes the foreach quiet (suppressing the 'Entering' output) and adds brackets and newlines so the divisions between submodules are easier to see. Let me see if I can make a function or alias to simplify this.

EDIT: Here's the hacked-together script, which is not as elegant. I just made it a bash function to simplify things:

git-greps() { git --no-pager sube -q foreach -t -r -c -k "git grep --color=always $(git rev-parse --sq-quote "$@") | perl -pe \"s|^|\$name/|\"" | less -R; }

Example

git-greps -nIE '\w+\(\w*\)'
eacousineau