923

I'm looking for the string foo= in text files in a directory tree. It's on a common Linux machine, I have bash shell:

grep -ircl "foo=" *

In the directories are also many binary files which match "foo=". As these results are not relevant and slow down the search, I want grep to skip searching these files (mostly JPEG and PNG images). How would I do that?

I know there are the --exclude=PATTERN and --include=PATTERN options, but what is the pattern format? The man page of grep says:

--include=PATTERN     Recurse in directories only searching file matching PATTERN.
--exclude=PATTERN     Recurse in directories skip file matching PATTERN.

Searching for "grep include", "grep include exclude", "grep exclude" and variants did not turn up anything relevant.

If there's a better way of grepping only in certain files, I'm all for it; moving the offending files is not an option. I can't search only certain directories (the directory structure is a big mess, with everything everywhere). Also, I can't install anything, so I have to make do with common tools (like grep or the suggested find).

Benjamin Loison
Piskvor left the building
  • 14
    Just FYI, the arguments used: -c count the matches in file -i case-insensitive -l only show matching files -r recursive – Piskvor left the building Oct 21 '08 at 13:56
  • 73
    A quicker way to exclude svn dirs is `--exclude-dir=.svn`, so grep doesn't go into them at all – orip Dec 02 '09 at 10:14
  • 26
    A couple of pedantic points people may need to know: 1. Note the lack of quotes around the glob here: --exclude='*.{png,jpg}' doesn't work (at least with my GNU grep version) because grep doesn't support {} in its globs. The above is shell-expanded to '--exclude=*.png --exclude=*.jpg' (assuming no files match in the cwd - highly unlikely since you don't normally start filenames with '--exclude=') which grep likes just fine. 2. --exclude is a GNU extension and not part of POSIX's definition of grep, so if you write scripts using this be aware they won't necessarily run on non-GNU systems. – ijw Jan 20 '11 at 14:11
  • 2
    Full example of exclude-dir usage: `grep -r --exclude-dir=var "pattern" .` – Tisch Aug 13 '15 at 10:01

22 Answers

938

Use the shell globbing syntax:

grep pattern -r --include=\*.cpp --include=\*.h rootdir

The syntax for --exclude is identical.

Note that the star is escaped with a backslash to prevent it from being expanded by the shell (quoting it, such as --include="*.cpp", would work just as well). Otherwise, if you had any files in the current working directory that matched the pattern, the command line would expand to something like grep pattern -r --include=foo.cpp --include=bar.cpp rootdir, which would only search files named foo.cpp and bar.cpp, which is quite likely not what you wanted.
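
To see the quoting rule in action, here is a minimal sketch. It assumes GNU grep and builds a throwaway tree with mktemp; the directory layout and file names are made up for the demonstration.

```shell
# Hypothetical scratch tree, created only for this demonstration.
demo=$(mktemp -d)
mkdir -p "$demo/src/sub"
printf 'int foo=1;\n' > "$demo/src/main.cpp"
printf 'int foo=2;\n' > "$demo/src/sub/util.h"
printf 'foo=3\n'      > "$demo/src/notes.txt"

# Quoting the globs keeps the shell from expanding them, so grep receives
# the literal patterns *.cpp and *.h and applies them to file names itself.
included=$(cd "$demo" && grep -rl --include='*.cpp' --include='*.h' 'foo=' src | LC_ALL=C sort)
printf 'with --include:\n%s\n' "$included"

# --exclude takes the same glob syntax.
excluded=$(cd "$demo" && grep -rl --exclude='*.txt' 'foo=' src | LC_ALL=C sort)
printf 'with --exclude:\n%s\n' "$excluded"

rm -rf "$demo"
```

Both runs should list src/main.cpp and src/sub/util.h and leave out src/notes.txt.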

Update 2021-03-04

I've edited the original answer to remove the use of brace expansion, which is a feature provided by several shells such as Bash and zsh to simplify patterns like this; but note that brace expansion is not POSIX shell-compliant.

The original example was:

grep pattern -r --include=\*.{cpp,h} rootdir

to search through all .cpp and .h files rooted in the directory rootdir.
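
If you want to check what grep actually receives from the brace form, a small sketch like this prints the expanded arguments. It assumes Bash is installed and calls it explicitly, since a strictly POSIX sh would pass the braces through unchanged.

```shell
# Let Bash perform the brace expansion and print one resulting argument per line.
expanded=$(bash -c 'printf "%s\n" --include=\*.{cpp,h}')
printf '%s\n' "$expanded"
```

This prints `--include=*.cpp` and `--include=*.h` on separate lines, which is why grep never sees the braces at all.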

Adam Rosenfield
  • 12
    I don't know why, but I had to quote the include pattern like this: `grep pattern -r --include="*.{cpp,h}" rootdir` – topek Dec 09 '11 at 07:41
  • 6
    @topek: Good point -- if you have any .cpp/.h files in your current directory, then the shell will expand the glob before invoking grep, so you'll end up with a command line like `grep pattern -r --include=foo.cpp --include=bar.h rootdir`, which will only search files named `foo.cpp` or `bar.h`. If you don't have any files that match the glob in the current directory, then the shell passes on the glob to grep, which interprets it correctly. – Adam Rosenfield Dec 14 '11 at 22:51
  • 7
    I just realized that the glob is matched only against the file name. To exclude a whole directory one needs the `--exclude-dir` option. The same rules apply, though: only the directory's name is matched, not its path. – Krzysztof Jabłoński Sep 22 '15 at 17:00
  • 3
    `--include` doesn't seem to work after `--exclude`. I suppose it doesn't make sense to even try, except that I have an `alias` to grep with a long list of `--exclude` and `--exclude-dir`, which I use for searching code, ignoring libraries and swap files and things. I would've hoped that `grep -r --exclude='*.foo' --include='*.bar'` would work, so I could limit my `alias` to `--include='*.bar'` only, but it seems to ignore the `--include` and include everything that's not a .foo file. Swapping the order of the `--include` and `--exclude` works, but alas, that's not helpful with my `alias`. – Michael Scheper Aug 10 '16 at 13:49
  • 1
    How are we supposed to guess the rules for this `PATTERN`? I've spent half an hour and can't find any description of what is expected there. – Arkady Aug 09 '18 at 08:22
  • For anyone sharing my frustration: single-item globs like `{sql}` don't match anything; you have to at least add a comma: `{sql,}`. – WoodrowShigeru Aug 15 '19 at 07:38
  • `{cpp,h}` is not a valid glob, so how come this is the accepted answer? `A pattern can use *, ?, and [...] as wildcards, and \ to quote a wildcard or backslash character literally` – Jakub Bochenski Mar 04 '21 at 13:47
  • @JakubBochenski That's a shell feature called *brace expansion* (https://www.gnu.org/software/bash/manual/bash.html#Brace-Expansion). In this case, it's shorthand for `--include=\*.cpp --include=\*.h`. – Adam Rosenfield Mar 04 '21 at 14:34
  • @AdamRosenfield it's not a shell feature, it's an extension of some shells like `bash` and will not work in plain POSIX shell – Jakub Bochenski Mar 04 '21 at 14:46
  • @JakubBochenski: Thanks, I wasn't aware that brace expansion wasn't a POSIX shell-compatible feature. I've edited the original answer for clarity. – Adam Rosenfield Mar 04 '21 at 19:29
  • @WoodrowSigeru That fixes a symptom, but creates new problems. The proper fix is just to take out the braces if you only have one pattern. – tripleee Mar 04 '21 at 19:32
236

If you just want to skip binary files, I suggest you look at the -I (upper case i) option. It ignores binary files. I regularly use the following command:

grep -rI --exclude-dir="\.svn" "pattern" *

It searches recursively, ignores binary files, and doesn't look inside Subversion hidden folders, for whatever pattern I want. I have it aliased as "grepsvn" on my box at work.
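
Here is a minimal sketch of what that combination does, assuming GNU grep with --exclude-dir support. The tree and file names are invented, and the "image" is just a text file with a NUL byte in it so that grep treats it as binary.

```shell
# Throwaway tree: one text file, one Subversion metadata file, one fake binary.
demo=$(mktemp -d)
mkdir -p "$demo/proj/.svn"
printf 'foo=text\n'       > "$demo/proj/config.txt"
printf 'foo=metadata\n'   > "$demo/proj/.svn/entries"
# The NUL byte makes grep classify this file as binary.
printf 'foo=\000binary\n' > "$demo/proj/image.png"

# -I skips the binary file, --exclude-dir keeps grep out of .svn entirely.
matches=$(cd "$demo" && grep -rIl --exclude-dir=.svn 'foo=' proj)
printf '%s\n' "$matches"

rm -rf "$demo"
```

Only proj/config.txt should be reported.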

KeithWM
rmeador
  • 27
    `--exclude-dir` is not available everywhere. my RH box at work with GNU grep 2.5.1 does not have it. – gcb May 18 '12 at 18:23
  • Any suggestions for what to use when `--exclude-dir` is unavailable? In all my attempts, `--exclude` does not appear to fit the bill. – JMTyler Mar 31 '14 at 18:05
  • 1
    You can always download the latest grep source from GNU, and do a 'configure; make; sudo make install'. This is one of the first things I do on a Mac or older Linux distribution. – Jonathan Hartley Apr 25 '14 at 09:50
  • 3
    Exactly what I needed. Actually, I use git. So, `--exclude-dir="\.git"`. :-) – Ionică Bizău Jul 02 '14 at 19:13
  • @IonicăBizău git has a grep wrapper which searches only files that are indexed in your repository: https://git-scm.com/docs/git-grep – Woodrow Barlow Feb 16 '17 at 18:19
73

Please take a look at ack, which is designed for exactly these situations. Your example of

grep -ircl --exclude=*.{png,jpg} "foo=" *

is done with ack as

ack -icl "foo="

because ack never looks in binary files by default, and -r is on by default. And if you want only CPP and H files, then just do

ack -icl --cpp "foo="
Andy Lester
  • Looks nice, will try the standalone Perl version next time, thanks. – Piskvor left the building Oct 21 '08 at 14:39
  • 5
    Good call, I can no longer live without ack. – Chance Nov 15 '10 at 21:21
  • 1
    http://stackoverflow.com/questions/667471/how-do-i-run-programs-with-strawberry-perl - This will allow you to get ack on windows, if that is where you are running grep from. – TamusJRoyce Dec 15 '10 at 16:58
  • @Chance Maybe you want [silversearcher-ag](https://github.com/ggreer/the_silver_searcher), just `apt-get` in Ubuntu :) – Justme0 Apr 18 '16 at 08:09
  • 1
    Ripgrep can also do this - ignores binary and git ignored files by default. To exclude a filetype, you use `rg --type-not cpp`, to search only for a filetype you use `rg --type cpp`. You can download just a single executable and run it. – user31389 May 05 '20 at 11:26
37

grep 2.5.3 introduced the --exclude-dir parameter which will work the way you want.

grep -rI --exclude-dir=\.svn PATTERN .

You can also set an environment variable: GREP_OPTIONS="--exclude-dir=\.svn"

I'll second Andy's vote for ack though, it's the best.

Lii
Corey
  • 7
    +1 for mentioning the exact version number; I have grep 2.5.1 and exclude-dir option is not available – James Jul 25 '11 at 21:25
32

I found this after a long time, you can add multiple includes and excludes like:

grep -r "z-index" . --include='*.js' --exclude='*js/lib/*' --exclude='*.min.js'
Rushabh Mehta
13

The suggested command:

grep -Ir --exclude="*\.svn*" "pattern" *

is conceptually wrong, because --exclude works on the basename. Put in other words, it will skip only the .svn in the current directory.
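
A minimal sketch of the difference, assuming a reasonably recent GNU grep (other grep implementations may match --exclude differently). The directory and file names are made up for the demonstration.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/proj/.svn"
printf 'foo=1\n' > "$demo/proj/a.txt"
printf 'foo=2\n' > "$demo/proj/.svn/entries"

# --exclude is compared against base names such as "entries", so with
# GNU grep the file inside .svn is expected to show up anyway.
with_exclude=$(cd "$demo" && grep -rl --exclude='*.svn*' 'foo=' proj | LC_ALL=C sort)
printf 'with --exclude:\n%s\n' "$with_exclude"

# --exclude-dir prunes the directory itself, so nothing below it is read.
with_exclude_dir=$(cd "$demo" && grep -rl --exclude-dir=.svn 'foo=' proj | LC_ALL=C sort)
printf 'with --exclude-dir:\n%s\n' "$with_exclude_dir"

rm -rf "$demo"
```

The second run should list only proj/a.txt.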

  • 3
    Yep, it doesn't work at all for me. The one that worked for me was: exclude-dir=.svn – Taryn East Jan 31 '11 at 18:07
  • 2
    @Nicola thank you! I've been tearing my hair out about why this won't work. Tell me, is there a way to discover this from the manpage? All it says is it matches "PATTERN". *EDIT* manpage says "file", as explained here http://fixunix.com/unix/89433-why-does-%60grep-exclude%3D-svn%60-not-work.html#post293985 – 13ren Jun 29 '11 at 16:40
11

In grep 2.5.1 you have to add this line to ~/.bashrc or ~/.bash_profile:

export GREP_OPTIONS="--exclude=\*.svn\*"
deric
8

I find grepping grep's output to be very helpful sometimes:

grep -rn "foo=" . | grep -v "Binary file"

Though, that doesn't actually stop it from searching the binary files.

Aaron Maenpaa
7

On CentOS 6.6/Grep 2.6.3, I have to use it like this:

grep "term" -Hnir --include \*.php --exclude-dir "*excluded_dir*"

Notice the lack of equal signs "=" (otherwise --include, --exclude, include-dir and --exclude-dir are ignored)

aesede
7

If you are not averse to using find, I like its -prune feature:

find [directory] \
        -name "pattern_to_exclude" -prune \
     -o -name "another_pattern_to_exclude" -prune \
     -o -name "pattern_to_INCLUDE" -print0 \
| xargs -0 -I FILENAME grep -IR "pattern" FILENAME

On the first line, you specify the directory you want to search. . (current directory) is a valid path, for example.

On the 2nd and 3rd lines, use "*.png", "*.gif", "*.jpg", and so forth. Use as many of these -o -name "..." -prune constructs as you have patterns.

On the 4th line, you need another -o (it specifies "or" to find), the patterns you DO want, and you need either a -print or -print0 at the end of it. If you just want "everything else" that remains after pruning the *.gif, *.png, etc. images, then use -o -print0 and you're done with the 4th line.

Finally, on the 5th line is the pipe to xargs, which substitutes each resulting file name for the placeholder FILENAME. Because of -I, xargs runs grep once per file, passing it the -IR flags, the "pattern", and that file name.

For your particular question, the statement may look something like:

find . \
     -name "*.png" -prune \
     -o -name "*.gif" -prune \
     -o -name "*.svn" -prune \
     -o -print0 | xargs -0 -I FILES grep -IR "foo=" FILES
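
As a variation, here is a minimal sketch that drops xargs and lets find hand the files to grep in batches via -exec ... {} +. It is one possible way to write it, not the only one; the tree and file names are invented, and the .svn directory is matched by its exact name.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/tree/.svn" "$demo/tree/img"
printf 'foo=1\n' > "$demo/tree/readme.txt"
printf 'foo=2\n' > "$demo/tree/.svn/entries"
printf 'foo=3\n' > "$demo/tree/img/logo.png"

# Anything named *.png or .svn is pruned; every remaining regular file
# is passed to grep, several names per invocation.
found=$(cd "$demo" && find tree \
        \( -name '*.png' -o -name '.svn' \) -prune \
     -o -type f -exec grep -Il 'foo=' {} +)
printf '%s\n' "$found"

rm -rf "$demo"
```

Only tree/readme.txt should be printed, since the PNG and everything under .svn never reach grep.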

OnlineCop
  • One amendment I'd suggest: include `-false` immediately after each `-prune` so forgetting to use `-print0` or some kind of `exec` command won't actually print the files you wanted to exclude: `-name "*.png" -prune -false -o -name "*.gif" -prune -false` ... – OnlineCop Nov 20 '19 at 15:40
6

git grep

Use git grep, which is optimized for performance and is meant for searching a specific set of files.

By default it ignores binary files and honors your .gitignore. If you're not working inside a Git repository, you can still use it by passing --no-index.

Example syntax:

git grep --no-index "some_pattern"

For more examples, see:

kenorb
5

I'm a dilettante, granted, but here's how my ~/.bash_profile looks:

export GREP_OPTIONS="-orl --exclude-dir=.svn --exclude-dir=.cache --color=auto" GREP_COLOR='1;32'

Note that to exclude two directories, I had to use --exclude-dir twice.
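
Since GREP_OPTIONS is deprecated in newer GNU grep releases, a shell function is one possible replacement. This is only a sketch: the function name grepsrc is hypothetical, and the option list is trimmed to the two --exclude-dir entries from above.

```shell
# Hypothetical wrapper; put it in ~/.bash_profile in place of GREP_OPTIONS.
grepsrc() {
    grep -r --exclude-dir=.svn --exclude-dir=.cache "$@"
}

# Quick check against a throwaway tree.
demo=$(mktemp -d)
mkdir -p "$demo/p/.svn" "$demo/p/.cache"
printf 'foo=1\n' > "$demo/p/a.txt"
printf 'foo=2\n' > "$demo/p/.svn/entries"
printf 'foo=3\n' > "$demo/p/.cache/tmp"

wrapped=$(cd "$demo" && grepsrc -l 'foo=' p)
printf '%s\n' "$wrapped"

rm -rf "$demo"
```

Only p/a.txt should be listed. Unlike the environment variable, the function affects nothing but the commands that call it by name.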

Benjamin Loison
4D4M
  • Necro comment from the distant dead .... GREP_OPTIONS is now deprecated, so I don't think these answers using that are valid anymore. Hey, I know it's late, but this is news to me. :) – TonyG Oct 30 '20 at 18:54
4

find and xargs are your friends. Use them to filter the file list rather than grep's --exclude

Try something like

find . -type f -not -name '*.png' -print | xargs grep -icl "foo="

The advantage of getting used to this is that it extends to other use cases, for example to count the lines in all non-png files:

find . -type f -not -name '*.png' -print | xargs wc -l

To remove all non-png files:

find . -type f -not -name '*.png' -print | xargs rm

etc.

As pointed out in the comments, if some files may have spaces in their names, use -print0 and xargs -0 instead.
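
A minimal sketch of that NUL-separated variant, assuming a find and xargs that support -print0 and -0 (GNU and BSD both do). The file names are invented, with a space in one of them on purpose.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/docs"
printf 'foo=1\n' > "$demo/docs/my notes.txt"
printf 'foo=2\n' > "$demo/docs/picture.png"

# Regular files that are not *.png, passed to grep as NUL-terminated names
# so the space in "my notes.txt" does not split the argument.
hits=$(cd "$demo" && find docs -type f ! -name '*.png' -print0 | xargs -0 grep -il 'foo=')
printf '%s\n' "$hits"

rm -rf "$demo"
```

The only hit should be "docs/my notes.txt", reported as a single name.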

Andrew Stein
  • 1
    This doesn't work on filenames with spaces, but that problem is easily solved by using print0 instead of print and adding the -0 option to xargs. – Adam Rosenfield Oct 21 '08 at 13:46
4

If you search non-recursively you can use glob patterns to match the filenames.

grep "foo" *.{html,txt}

includes html and txt. It searches in the current directory only.

To search in the subdirectories:

grep "foo" */*.{html,txt}

In the subsubdirectories:

grep "foo" */*/*.{html,txt}
Benjamin Loison
Stéphane Laurent
4

In the directories are also many binary files. I can't search only certain directories (the directory structure is a big mess). Is there a better way of grepping only in certain files?

ripgrep

This is one of the quickest tools designed to recursively search your current directory. It is written in Rust, built on top of Rust's regex engine for maximum efficiency. Check the detailed analysis here.

So you can just run:

rg "some_pattern"

It respects your .gitignore and automatically skips hidden files/directories and binary files.

You can still customize include or exclude files and directories using -g/--glob. Globbing rules match .gitignore globs. Check man rg for help.

For more examples, see: How to exclude some files not matching certain extensions with grep?

On macOS, you can install via brew install ripgrep.

kenorb
3

Try this one:

$ find . -name "*.txt" -type f -print | xargs file | grep "foo=" | cut -d: -f1

Found here: http://www.unix.com/shell-programming-scripting/42573-search-files-excluding-binary-files.html

Benjamin Loison
Gravstar
  • 3
    This doesn't work on filenames with spaces, but that problem is easily solved by using print0 instead of print and adding the -0 option to xargs. – Adam Rosenfield Oct 21 '08 at 13:58
2

Those scripts don't solve the whole problem. Try this instead:

du -ha | grep -i -o "\./.*" | grep -v "\.svn\|another_file\|another_folder" | xargs grep -i -n "$1"

This script is better because it uses "real" regular expressions to exclude directories from the search. Just separate folder or file names with "\|" in the grep -v pattern.

enjoy it! found on my linux shell! XD

2

Look @ this one.

grep --exclude="*\.svn*" -rn "foo=" * | grep -v Binary | grep -v tags
animuson
  • 2
    Things that achieve approximately this have been covered in other posts; what's more, this is wrong, in that with various layout options set it will mess up line numbers and things like that or exclude lines of context which were desired. – Chris Morgan Nov 10 '10 at 05:38
1

The --binary-files=without-match option to GNU grep gets it to skip binary files. (Equivalent to the -I switch mentioned elsewhere.)
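
To check that the two spellings behave the same on your grep, a minimal sketch like this runs both over a throwaway directory. It assumes GNU grep; the file names are invented, and the "binary" file is just text with NUL bytes in it.

```shell
demo=$(mktemp -d)
printf 'foo=text\n'     > "$demo/plain.txt"
# NUL bytes make grep treat this file as binary.
printf 'foo=\000\000\n' > "$demo/blob.bin"

long_form=$(cd "$demo" && grep -rl --binary-files=without-match 'foo=' . | LC_ALL=C sort)
short_form=$(cd "$demo" && grep -rIl 'foo=' . | LC_ALL=C sort)
printf 'long form:  %s\nshort form: %s\n' "$long_form" "$short_form"

rm -rf "$demo"
```

Both forms should report ./plain.txt only.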

(This might require a recent version of grep; 2.5.3 has it, at least.)

mjs
1

suitable for tcsh .alias file:

alias gisrc 'grep -I -r -i --exclude="*\.svn*" --include="*\."{mm,m,h,cc,c} \!* *'

Took me a while to figure out that the {mm,m,h,cc,c} portion should NOT be inside quotes. ~Keith

Keith Knauber
-1

To ignore all binary results from grep

grep -Ri "pattern" * | awk '{if($1 != "Binary") print $0}'

The awk part filters out all the "Binary file foo matches" lines.

Nakilon
lathomas64
-3

Try this:

  1. Create a folder named "--F" under the current directory (or link another folder there, renamed to "--F", i.e. double-minus-F).
  2. #> grep -i --exclude-dir="\-\-F" "pattern" *
jacoz