26

I'm working on a Python program that makes heavy use of eggs (Plone). That means there are 198 directories full of Python code I might want to search through while debugging. Is there a good way to search only the .py files in only those directories, avoiding unrelated code and large binary files?

joeforker

12 Answers

30
find DIRECTORY -name "*.py" | xargs grep PATTERN

By the way, since writing this, I have discovered ack, which is a much better solution.

(And since that edit, I have discovered ag).
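One caveat with the pipeline above: plain xargs splits its input on whitespace, so a path containing a space gets broken apart. Here is a minimal sketch of a NUL-delimited variant. The scratch directory, the file names and the pattern `needle` are made up for illustration, and `-print0` / `-0` are GNU and BSD extensions, so they may be missing on other systems.

```shell
# Build a small scratch tree (hypothetical names, for demonstration only).
demo=$(mktemp -d)
mkdir -p "$demo/my egg/pkg"
printf 'needle = 1\n' > "$demo/my egg/pkg/mod.py"
printf 'needle in a text file\n' > "$demo/my egg/pkg/notes.txt"

# -print0 and -0 pass file names NUL-separated, so the space in "my egg" is safe.
# -H makes grep print the file name even when it receives a single file.
result=$(find "$demo" -name "*.py" -print0 | xargs -0 grep -H "needle")
echo "$result"

# Clean up the scratch tree.
rm -rf "$demo"
```

Only mod.py should be reported; notes.txt also contains the pattern but is filtered out by `-name "*.py"`.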

Community
Steve B.
26
grep -r -n "PATTERN" --include="*.py" DIRECTORY
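To see what the flags do, here is a small sketch run against a throwaway tree. The directory layout and the pattern `needle` are invented for the demo; `--include` is available in GNU grep and in the BSD grep shipped with macOS, but it is not part of POSIX.

```shell
# Scratch tree with one Python file and one non-Python file (hypothetical names).
demo=$(mktemp -d)
mkdir -p "$demo/pkg"
printf '# header\nneedle = 1\n' > "$demo/pkg/a.py"
printf 'needle\n' > "$demo/pkg/b.txt"

# -r recurses into DIRECTORY, -n prefixes each match with its line number,
# --include restricts the search to files whose name matches the glob.
result=$(grep -r -n "needle" --include="*.py" "$demo")
echo "$result"

# Clean up the scratch tree.
rm -rf "$demo"
```

The match in b.txt is skipped, and the hit in a.py is reported with its line number (2).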
monowerker
19

I would strongly recommend ack, a grep substitute "aimed at programmers with large trees of heterogeneous source code" (from the website).

Brian Agnew
7

I also use ack a lot these days. I did tweak it a bit to find all the relevant file types:

# Add zcml to the xml type:
--type-add
xml=.zcml

# Add more files to the plone type:
--type-add
plone=.dtml,.zpt,.kss,.vpy,.props

# buildout config files
--type-set
buildout=.cfg

# Include our page templates in the html type so we can limit our search:
--type-add
html=.pt,.zpt

# Create txt file type:
--type-set
txt=.txt,.rst

# Define i18n file types:
--type-set
i18n=.pot,.po

# More options
--follow
--ignore-case
--nogroup

It is important to remember that ack won't search a file if its extension isn't in ack's configuration. See "ack --help-types" for all the available types.

I also assume you are using omelette so you can grep/ack/find all the related files?

Mark van Lent
4

This problem was the motivation for the creation of collective.recipe.omelette. It is a buildout recipe which can symlink all the eggs from your working set into one directory structure, which you can point your favorite search utility at.

David Glick
  • The grep-oriented answers are red herrings. They'll lead you to find multiple versions of files, all but one in un-used versions of the code (buildout may have fetched different egg releases over time). Use omelette, and grep the symlink structure which it generates. – Jean Jordaan Mar 14 '11 at 10:12
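A rough illustration of the point about stale egg releases, using a hand-built stand-in for an omelette directory. Everything here is an assumption for the demo: the egg names, the `parts/omelette` location and the single `ln -s` that imitates what the recipe does. `find -L` is used because the omelette tree consists of symlinks, which find does not follow by default.

```shell
# Fake egg cache with two releases of the same package (hypothetical names).
demo=$(mktemp -d)
mkdir -p "$demo/eggs/pkg-1.0.egg/pkg" "$demo/eggs/pkg-2.0.egg/pkg" "$demo/parts/omelette"
printf 'VERSION = "1.0"\n' > "$demo/eggs/pkg-1.0.egg/pkg/mod.py"
printf 'VERSION = "2.0"\n' > "$demo/eggs/pkg-2.0.egg/pkg/mod.py"

# Stand-in for the recipe: link only the release that is in the working set.
ln -s "$demo/eggs/pkg-2.0.egg/pkg" "$demo/parts/omelette/pkg"

# Searching the egg cache reports a hit in both releases...
all_hits=$(find "$demo/eggs" -name "*.py" -exec grep -H "VERSION" {} + | wc -l | tr -d ' ')

# ...while the symlink tree yields only the active one. -L follows the symlinks.
active=$(find -L "$demo/parts/omelette" -name "*.py" -exec grep -H "VERSION" {} +)
echo "hits in egg cache: $all_hits"
echo "$active"

# Clean up the scratch tree.
rm -rf "$demo"
```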
4
find <directory> -name '*.py' -exec grep <pattern> {} \;
phuclv
Mike
  • This version is 26 times slower than the | xargs or standalone grep solution because it executes grep 16,836 times instead of once. – joeforker May 14 '09 at 20:31
  • But if you end it with a + instead of \;, then it's equivalent to the xargs solution, except that it doesn't break if your pathnames have spaces in them. – Marius Gedminas Sep 17 '09 at 20:56
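The difference between the two terminators can be checked with a small experiment. This is only a sketch: the three file names are invented, and a tiny `sh -c 'echo run'` command stands in for grep so that each started process prints exactly one line that can be counted.

```shell
# Three Python files whose names contain a space (hypothetical names).
demo=$(mktemp -d)
for n in 1 2 3; do printf 'needle\n' > "$demo/file $n.py"; done

# With \; find starts one process per file; with + it batches the file names.
# The stand-in command prints one line per invocation, so wc -l counts processes.
per_file=$(find "$demo" -name '*.py' -exec sh -c 'echo run' sh {} \; | wc -l | tr -d ' ')
batched=$(find "$demo" -name '*.py' -exec sh -c 'echo run' sh {} + | wc -l | tr -d ' ')
echo "per file: $per_file, batched: $batched"

# The real search in batched form; paths with spaces are passed to grep intact.
matches=$(find "$demo" -name '*.py' -exec grep -H 'needle' {} + | wc -l | tr -d ' ')
echo "matching lines: $matches"

# Clean up the scratch tree.
rm -rf "$demo"
```

With only three files the batched form runs a single process; on a very large tree find may still split the list into a few batches to stay under the system's argument-length limit.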
2

There's also GNU idutils if you want to grep for identifiers in a large source tree very very quickly. It requires building a search database in advance, by running mkid (and tweaking its config file to not ignore .py files). z3c.recipe.tag takes care of that, if you use buildout.

Marius Gedminas
2

Just in case you want a non-commandline OSS solution...

I use PyCharm. It has built-in support for buildout. You point it at a buildout-generated bin/instance and it sets the project's external dependencies to all the eggs used by the instance. Then all of the IDE's introspection and code navigation work nicely: go to definition, go to instances, refactoring support and, of course, search.

djay
2

I recommend grin for searching, omelette when working with Plone, and the PyDev feature 'Globals browser' (with Eclipse or Aptana Studio).

pbauer
  • Here are helpful scripts to import omelette + src folder to Eclipse: http://svn.plone.org/svn/collective/collective.eclipsescripts/trunk/README.txt – Mikko Ohtamaa Mar 25 '11 at 15:15
2

And simply because there are not enough answers...

If you're developing routinely, it's well worth the effort to install Eclipse with PyDev (or, even easier, Aptana Studio, which is a modified Eclipse), in which case the find tools are right there.

Auspex
  • Here is a script that imports buildout + omelette into Aptana: http://svn.plone.org/svn/collective/collective.eclipsescripts/trunk/README.txt – Mikko Ohtamaa Mar 25 '11 at 15:15
2

OpenGrok is an excellent choice for source searching and navigation. Runs on Java, though.

I really wish there was something like https://oracle.github.io/opengrok/

Suncatcher
Davi Lima
1

My grepping life is way more satisfying since discovering Emacs' rgrep command.

Say I want to find 'IPortletDataProvider' in Plone's source. I do:

  1. M-x rgrep
  2. Emacs prompts for the search string (IPortletDataProvider)
  3. ... then which files to search (*.py)
  4. ... then which directory (~/Plone/buildout-cache/eggs). If I'm already editing a file, this defaults to that file's directory, which is usually exactly what I want.

The results appear in a new buffer. At the top is the find | xargs grep command Emacs ran. All matches are highlighted. I can search the buffer using the standard text search commands. Best of all, I can hit Enter (or click) on a match to open that file.

It's a pretty nice way to work. I like that I don't have to remember find | xargs grep argument sequences, but that all that power is there if I need it.


Dan Jacka