How to grep for two words existing on the same line?

Question

How do I grep for lines that contain both of two given words? I tried a pipe like this:

grep -c "word1" | grep -r "word2" logs

It just gets stuck after the first pipe command.

Why?

Possible duplicate of [How to use grep to match string1 AND string2?](https://stackoverflow.com/questions/4487328/how-to-use-grep-to-match-with-multple-strings) — jww, Mar 13 '19 at 04:43

houbysoft · Accepted Answer · 2011-06-26T04:03:25.933

219

Why do you pass -c? That will just show the number of matches. Similarly, there is no reason to use -r. I suggest you read man grep.

To grep for 2 words existing on the same line, simply do:

grep "word1" FILE | grep "word2"

grep "word1" FILE will print all lines that have word1 in them from FILE, and then grep "word2" will print the lines that have word2 in them. Hence, if you combine these using a pipe, it will show lines containing both word1 and word2.

If you just want a count of how many lines had the 2 words on the same line, do:

grep "word1" FILE | grep -c "word2"
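
As a minimal sketch of the two pipelines above, assuming a throwaway sample file (the file contents and the variable names `sample`, `both` and `count` are made up for illustration):

```shell
# Create a throwaway sample file (contents are illustrative only).
sample=$(mktemp)
printf '%s\n' \
  'word1 and word2 together' \
  'only word1 here' \
  'only word2 here' \
  'word2 comes before word1' > "$sample"

# The first grep keeps lines with word1; the second keeps those that also have word2.
both=$(grep "word1" "$sample" | grep "word2")
printf '%s\n' "$both"

# Same filter, but count the surviving lines instead of printing them.
count=$(grep "word1" "$sample" | grep -c "word2")
printf '%s\n' "$count"

rm -f "$sample"
```

The order of the words on the line does not matter here, because each grep only checks for its own word.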

Also, to address your question of why it gets stuck: in grep -c "word1", you did not specify a file. Therefore, grep reads its input from stdin, which is why it seems to hang. You can press Ctrl+D to send an EOF (end-of-file) so that it quits.
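
To see the stdin behaviour without a hanging terminal, here is a small sketch in which the input is supplied by `printf` or by `/dev/null` (both are stand-ins chosen for illustration, and the variable names are hypothetical):

```shell
# With no file operand, grep reads standard input. Here the input comes
# from printf, so grep sees EOF as soon as printf exits and returns at once.
piped=$(printf 'word1\nother\nword1 again\n' | grep -c "word1")
printf '%s\n' "$piped"

# Redirecting from /dev/null gives an immediate EOF: zero matches, no hang.
# grep exits with status 1 when nothing matches, hence the "|| true".
empty=$(grep -c "word1" < /dev/null || true)
printf '%s\n' "$empty"
```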

edited Jun 26 '11 at 04:03

answered Jun 25 '11 at 21:39

houbysoft


64

When you're confused, the man pages are pretty much the last place you want to go for clarification. They're more confusing than randomly guessing. – corsiKa Jun 25 '11 at 21:45
8

@TotalFrickinRockstarFromMars: I disagree. It's true that in the beginning they might seem confusing, but once you get accustomed to the format using them is pretty straightforward. Anyway, I included it in the answer more for the "teach a man how to fish" bit, I expected the OP doesn't know them, and man pages can get pretty handy. – houbysoft Jun 25 '11 at 21:53
10

@houbysoft Then we'll have to agree to disagree. I've been using Linux and friends for the better part of 8 years, and I'd still rather google than use man pages. – corsiKa Jun 25 '11 at 22:10
@TotalFrickinRockstarFromMars: Well, I'm not denying the use of that. Anyway, could you point to some specific thing you find "confusing" in the grep man page, for example? – houbysoft Jun 25 '11 at 23:19
@houbysoft: what if I need to do a count – Jun 26 '11 at 03:58
@user157195: see edit, `grep "word1" FILE | grep -c "word2"`. – houbysoft Jun 26 '11 at 04:03
"could you point to some specific thing you find 'confusing' in the grep man page?" - it's confusing why they don't include simple examples ;). I think the broader issue is that linux utilities throw the kitchen sink at you with options, then supply single letter aliases so they can be concisely encrypted on the internet for everyone to then decypher it by googling what on earth a command does. – aaaaaa Jan 30 '18 at 21:16
I have to say that the only reason that google searches are helpful is because there are good man pages and experienced people who can interpret them (or are forced to do so). @corsiKa – geneorama Jul 02 '18 at 17:34
6

@geneorama Sure, but maybe the people who wrote the utilities to begin with would write better man pages and it wouldn't be an issue. The man pages are written for people who already know the tool and just need a little reminder. They're not written for people trying to figure out what they're doing. – corsiKa Jul 02 '18 at 17:36
I’m using GNU grep 3.1. The command: grep “string1” FILE | grep “string2” only finds string2. – Jul 10 '19 at 18:15
If word1 also happened to be part of FILE's name, the result could be not what you want – Randy Lam Jul 15 '20 at 04:17

Jonathan Leffler · Answer 2 · 2015-09-09T16:49:25.547

Prescription

One simple rewrite of the command in the question is:

grep "word1" logs | grep "word2"

The first grep finds lines with 'word1' from the file 'logs' and then feeds those into the second grep which looks for lines containing 'word2'.

However, it isn't necessary to use two commands like that. You could use extended grep (grep -E or egrep):

grep -E 'word1.*word2|word2.*word1' logs

If you know that 'word1' will precede 'word2' on the line, you don't even need the alternatives and regular grep would do:

grep 'word1.*word2' logs

The 'one command' variants have the advantage that there is only one process running, and so the lines containing 'word1' do not have to be passed via a pipe to the second process. How much this matters depends on how big the data file is and how many lines match 'word1'. If the file is small, performance isn't likely to be an issue and running two commands is fine. If the file is big but only a few lines contain 'word1', there isn't going to be much data passed on the pipe and using two commands is fine. However, if the file is huge and 'word1' occurs frequently, then you may be passing significant data down the pipe where a single command avoids that overhead. Against that, the regex is more complex; you might need to benchmark it to find out what's best, but only if performance really matters. If you run two commands, you should aim to select the less frequently occurring word in the first grep to minimize the amount of data processed by the second.
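
A rough sketch comparing the variants on a tiny sample (the temporary file stands in for 'logs', and its contents are invented for illustration; it says nothing about performance):

```shell
# Throwaway stand-in for the 'logs' file.
logs=$(mktemp)
printf '%s\n' \
  'word1 then word2' \
  'word2 then word1' \
  'word1 alone' \
  'word2 alone' > "$logs"

piped=$(grep 'word1' "$logs" | grep 'word2')                 # two processes
single=$(grep -E 'word1.*word2|word2.*word1' "$logs")        # one process, either order
ordered=$(grep 'word1.*word2' "$logs")                       # one process, fixed order

if [ "$piped" = "$single" ]; then echo "pipeline and single regex agree"; fi
printf '%s\n' "$ordered"

rm -f "$logs"
```

The two forms are not identical in every case: if the two words are the same, or overlap within the line, the pipeline can match a line that the single regex does not.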

Diagnosis

The initial script is:

grep -c "word1" | grep -r "word2" logs

This is an odd command sequence. The first grep is going to count the number of occurrences of 'word1' on its standard input, and print that number on its standard output. Until you indicate EOF (e.g. by typing Control-D), it will sit there, waiting for you to type something. The second grep does a recursive search for 'word2' in the files underneath directory logs (or, if it is a file, in the file logs). Or, in my case, it will fail since there's neither a file nor a directory called logs where I'm running the pipeline. Note that the second grep doesn't read its standard input at all, so the pipe is superfluous.

With Bash, the parent shell waits until all the processes in the pipeline have exited, so it sits around waiting for the grep -c to finish, which it won't do until you indicate EOF. Hence, your code seems to get stuck. With Heirloom Shell, the second grep completes and exits, and the shell prompts again. Now you have two processes running, the first grep and the shell, and they are both trying to read from the keyboard, and it is not determinate which one gets any given line of input (or any given EOF indication).

Note that even if you typed data as input to the first grep, you would only get any lines that contain 'word2' shown on the output.
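
A small sketch of the point that a grep with a file operand never reads its standard input (the temporary file stands in for 'logs', and `-r` is left out to keep the output simple):

```shell
# Throwaway stand-in for the 'logs' file.
logs=$(mktemp)
printf 'word2 from the file\n' > "$logs"

# grep is given a file operand, so the text arriving on the pipe is ignored.
out=$(printf 'word2 from stdin\n' | grep "word2" "$logs")
printf '%s\n' "$out"

rm -f "$logs"
```

Only the line from the file appears; the piped line is discarded.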

Footnote:

At one time, the answer used:

grep -E 'word1.*word2|word2.*word1' "$@"
grep 'word1.*word2' "$@"

This triggered the comments below.

What is the use of "$@"? Can you explain? You have not mentioned any file name. — Prabhat Kumar Singh, Oct 27 '14 at 09:11
@PrabhatKumarSingh: Inside a shell script, `"$@"` expands to all the arguments passed to the shell script (that haven't been shifted away). It could be a list of file names or it could be empty, in which case `grep` will read from standard input. The original code in the question doesn't mention any file names either. It will read from standard input, therefore. — Jonathan Leffler, Oct 27 '14 at 14:18
ok I understand what $@ means in shell script but I have not seen _script_ mentioned in your answer that's why got confused. — Prabhat Kumar Singh, Oct 27 '14 at 14:25
Another positive thing of this solution is that it works if both words are the same, that means it can also detect whether a word is repeated in a line. The accepted solution doesn't handle this case. +1. — Diego Pino, Jun 18 '16 at 11:11
If you use the `--color=auto` flag, this solution also highlights the results in a better way than when using two greps. — Charles Clayton, May 25 '17 at 18:58
@JonathanLeffler When `grep 'word1.*word2'` is used can it highlight only `word1` and `word2` on the find lines? — alper, Jun 19 '22 at 13:15
@alper — no. You might get the desired result with `grep 'word1.*word2' | grep -F -e 'word1' -e 'word2'`, but I've not checked that. — Jonathan Leffler, Jun 19 '22 at 13:59
When you have colored output, this solution colors `word1...everything in between...word2`. Is there a way to color just word1 and word2, like when you do `grep 'word1|word2'` (that is OR), but with something analogous for AND, like `grep 'word1&word2'`? — DenisZ, Jan 13 '23 at 08:50
It certainly wouldn't be trivial to achieve that, @DenisZ. You would probably need to run the output from this code through a colourizing `grep` looking for just the two words: `grep -E -e 'word1|word2'`. — Jonathan Leffler, Jan 13 '23 at 13:31
Thanks @JonathanLeffler I posted question and got a few solutions, but looks to be the easiest to use `--color=always`, as you suggested, and just grep word2 in pipe — DenisZ, Jan 13 '23 at 14:04
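
To illustrate the `"$@"` form from the footnote, here is a sketch that wraps the command in a shell function rather than a script file. The function name `both_words` and the sample data are hypothetical; inside a function, `"$@"` expands to the function's arguments in the same way it expands to a script's arguments.

```shell
# Hypothetical helper: prints lines containing both words, in either order.
# "$@" expands to the function's arguments, each kept as a separate word.
both_words() {
  grep -E 'word1.*word2|word2.*word1' "$@"
}

logs=$(mktemp)
printf '%s\n' 'word2 before word1' 'word1 only' > "$logs"

# A file name is passed through "$@" to grep.
from_file=$(both_words "$logs")

# No arguments: "$@" expands to nothing, so grep reads standard input.
from_stdin=$(printf 'word1 then word2\n' | both_words)

printf '%s\n' "$from_file" "$from_stdin"
rm -f "$logs"
```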

Colin MacKenzie - III · Answer 3 · 2014-06-03T13:42:39.687

12

You could use awk, like this:

cat <yourFile> | awk '/word1/ && /word2/'

Order is not important. So if you have a file named file1 that contains:

word1 is in this file as well as word2
word2 is in this file as well as word1
word4 is in this file as well as word1
word5 is in this file as well as word2

then,

/tmp$ cat file1| awk '/word1/ && /word2/'

will result in,

word1 is in this file as well as word2
word2 is in this file as well as word1

yes, awk is slower.
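
A sketch of the same awk filter reading the file directly, assuming the four sample lines shown above are written to a temporary file (the variable names are made up for illustration):

```shell
# Recreate the sample file from the answer in a temporary location.
file1=$(mktemp)
printf '%s\n' \
  'word1 is in this file as well as word2' \
  'word2 is in this file as well as word1' \
  'word4 is in this file as well as word1' \
  'word5 is in this file as well as word2' > "$file1"

# awk takes the file as an operand; with no action given, matching lines are printed.
matches=$(awk '/word1/ && /word2/' "$file1")
printf '%s\n' "$matches"

rm -f "$file1"
```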

edited Jun 03 '14 at 13:42

answered Jun 03 '14 at 13:21

Colin MacKenzie - III


3

useless use of `cat(1)` – Michael Shigorin Oct 17 '18 at 20:13
2

A single Awk is still likely to be faster than two separate `grep` processes. (But of course the extra [useless `cat`](/questions/11710552/useless-use-of-cat) process would more or less nullify that difference.) – tripleee Apr 27 '19 at 20:37

score 7 · Answer 4 · edited Mar 07 '13 at 14:55


The main issue is that you haven't supplied the first grep with any input. You will need to reorder your command something like

grep "word1" logs | grep "word2"

If you want to count the occurrences, then put a '-c' on the second grep.

edited Mar 07 '13 at 14:55

Jonathan Leffler


answered Nov 26 '12 at 09:54

sysboy


score 4 · Answer 5 · edited Aug 28 '13 at 11:25


You can try the command below:

cat log|grep -e word1 -e word2

edited Aug 28 '13 at 11:25

Taryn


answered Aug 28 '13 at 08:41

user2724604


5

These commands search for at least one word, not for all. And the cat | is unnecessary; you can give the file as grep's last argument – Mat M Jun 10 '16 at 12:40
5

Probably useless use of cat?! – Ganapathy Aug 30 '17 at 11:11
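
A short sketch of the difference the comments point out, using an invented four-line sample: multiple `-e` patterns match lines with at least one word (OR), while a pipeline of two greps matches lines with both (AND).

```shell
# Throwaway sample file (contents are illustrative only).
log=$(mktemp)
printf '%s\n' 'word1 only' 'word2 only' 'word1 and word2' 'neither' > "$log"

# OR: count lines containing at least one of the words.
either=$(grep -c -e word1 -e word2 "$log")

# AND: count lines containing both words.
both=$(grep word1 "$log" | grep -c word2)

printf 'either=%s both=%s\n' "$either" "$both"
rm -f "$log"
```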

kenorb · Answer 6 · 2019-04-30T18:10:03.150

`git grep`

Here is the syntax using git grep combining multiple patterns using Boolean expressions:

git grep -e pattern1 --and -e pattern2 --and -e pattern3

The above command will print lines matching all the patterns at once.

If the files aren't under version control, add the --no-index parameter, which the manual describes as: "Search files in the current directory that is not managed by Git."

Check man git-grep for help.
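
A minimal sketch of `--and` together with `--no-index`, assuming git is installed and the temporary directory is not inside a repository (the file name `notes.txt` and its contents are invented for illustration):

```shell
# Only run the example when git is available.
if command -v git >/dev/null 2>&1; then
  dir=$(mktemp -d)
  printf '%s\n' 'pattern1 only' 'pattern1 and pattern2' > "$dir/notes.txt"

  # --no-index searches plain files outside version control;
  # -h omits the file name prefix from each matching line.
  hits=$(cd "$dir" && git grep --no-index -h -e pattern1 --and -e pattern2)
  printf '%s\n' "$hits"

  rm -rf "$dir"
fi
```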
