7

I know egrep has a very useful way of anding two expressions together by using:

egrep "pattern1.*pattern2"|egrep "pattern2.*pattern1" filename.txt|wc -l

However, is there an easy way to use egrep's AND operator when searching for three expressions? The number of permutations grows factorially as you add extra expressions (three patterns already need six orderings).

I know the other way of going about it, using sort|uniq -d, but I am looking for a simpler solution.

EDIT:

My current way of searching yields five results in total:

#!/bin/bash
pid=$$
grep -i "angio" rtrans.txt|sort|uniq|egrep -o "^[0-9]+ [0-9]+ " > /tmp/$pid.1.tmp
grep -i "cardio" rtrans.txt|sort|uniq|egrep -o "^[0-9]+ [0-9]+ " > /tmp/$pid.2.tmp
grep -i "pulmonary" rtrans.txt|sort|uniq|egrep -o "^[0-9]+ [0-9]+ " > /tmp/$pid.3.tmp
cat /tmp/$pid.1.tmp /tmp/$pid.2.tmp|sort|uniq -d > /tmp/$pid.4.tmp
cat /tmp/$pid.4.tmp /tmp/$pid.3.tmp|sort|uniq -d > /tmp/$pid.5.tmp
egrep -o "^[0-9]+ [0-9]+ " /tmp/$pid.5.tmp|getDoc.mps > /tmp/$pid.6.tmp
head -10 /tmp/$pid.6.tmp

mumps@debianMumpsISR:~/Medline2012$ AngioAndCardioAndPulmonary.script 
1514 Structural composition of central pulmonary arteries. Growth potential after surgical shunts.
1517 Patterns of pulmonary arterial anatomy and blood supply in complex congenital heart disease
with pulmonary atresia
3034 Controlled reperfusion following regional ischemia.
3481 Anaesthetic management for oophorectomy in pulmonary lymphangiomyomatosis.
3547 A comparison of methods for limiting myocardial infarct expansion during acute reperfusion--
primary role of unload

While:

mumps@debianMumpsISR:~/Medline2012$ grep "angio" rtrans.txt|grep "cardio" rtrans.txt|grep "pulmonary" rtrans.txt|wc -l
185

yields 185 lines of text because it only takes the result of the search for pulmonary instead of combining all three searches.

Bob
  • What does a `sort` have to do with a `grep`? I really didn't get this one. – Rubens Mar 02 '13 at 17:28
  • Your example should read `egrep "pattern1.*pattern2|pattern2.*pattern1" filename.txt` – Olaf Dietsche Mar 02 '13 at 18:07
  • See [Check if all of multiple strings or regexes exist in a file](https://stackoverflow.com/q/49762772/6862601). – codeforester Apr 20 '18 at 02:19
  • @triplee, this is not a duplicate. The presented duplicate searches for multiple patterns in a file, while this question searches for multiple patterns in the same line. – kvantour May 08 '19 at 09:26

3 Answers

9

How about:

grep "pattern1" file|grep "pattern2"|grep "pattern3" 

This gives the lines that contain p1, p2 and p3, in any order.
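A minimal sketch of the difference between giving the filename once and repeating it on every grep. The sample file and its contents are invented for illustration, and `grep -c` is used in place of `wc -l` to count lines:

```shell
#!/bin/bash
# Hypothetical sample data, made up for this illustration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
1 angio cardio pulmonary
2 pulmonary cardio angio
3 angio only
4 cardio and pulmonary
5 pulmonary only
EOF

# Filename given once: each later grep filters the previous grep's output,
# so only lines containing all three words survive.
and_count=$(grep "angio" "$sample" | grep "cardio" | grep -c "pulmonary")

# Filename repeated: a grep that is given a file does not read its standard
# input, so only the last grep's matches are counted.
last_only=$(grep "angio" "$sample" | grep "cardio" "$sample" | grep -c "pulmonary" "$sample")

echo "filename once:     $and_count"   # 2 (lines 1 and 2)
echo "filename repeated: $last_only"   # 4 (every line containing pulmonary)

rm -f "$sample"
```

With the filename repeated, the count is simply the number of lines matching the last pattern, which is consistent with the 185 reported in the question.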

Kent
  • That however will overlap pattern1, pattern2 and pattern3 giving multiple repeated results for each line. – Bob Mar 02 '13 at 18:32
  • @BobDunakey I didn't get you. can you paste some example input and expected output. so that I can know, what you want to get? – Kent Mar 02 '13 at 19:58
  • added example search in original post. – Bob Mar 02 '13 at 20:23
  • @BobDunakey you should `grep .. file|grep..|grep` , not `grep ..file|grep .. file|grep.. file` – Kent Mar 02 '13 at 21:02
1

The approach of Kent with

grep "pattern1" file|grep "pattern2"|grep "pattern3" 

is correct and should be faster. Just for the record, I wanted to post an alternative that uses egrep to do the same without piping:

egrep "pattern1.*pattern2|pattern2.*pattern1"

which looks for p1 followed by p2 or p2 followed by p1.
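To illustrate how this scales to three patterns, here is a rough sketch. The patterns `a1`, `b2`, `c3` and the sample file are hypothetical, and `grep -E` is used as the equivalent of `egrep`. The awk line is not part of the approach above; it is a swapped-in alternative shown only for comparison, since it tests each pattern independently and needs no permutations:

```shell
#!/bin/bash
# Hypothetical sample data, made up for this illustration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
a1 b2 c3
c3 a1 b2
b2 c3 a1
a1 b2
c3 only
EOF

# A single alternation for three patterns must list all 3! = 6 orderings.
re='a1.*b2.*c3|a1.*c3.*b2|b2.*a1.*c3|b2.*c3.*a1|c3.*a1.*b2|c3.*b2.*a1'
alt_count=$(grep -E -c "$re" "$sample")

# Alternative (not egrep): awk checks each pattern separately on every line.
awk_count=$(awk '/a1/ && /b2/ && /c3/ { n++ } END { print n + 0 }' "$sample")

echo "alternation: $alt_count"   # 3
echo "awk:         $awk_count"   # 3

rm -f "$sample"
```

The two forms agree here because the sample patterns never overlap within a line; with patterns that can share characters, the alternation may miss lines that the independent tests accept.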

Stanislav
0

The original question is about why his egrep command didn't work.

egrep "pattern1.*pattern2"|egrep "pattern2.*pattern1" filename.txt|wc -l

Kent and Stanislav are correct in pointing out the syntax error by putting the filename.txt up front. But this doesn't address the original problem.

Bob's "current way" (4 years ago) was a multi-command approach to grep out different keywords on different lines. In other words, his script was looking for a set of lines containing any of his search terms. The other proposed solutions would only result in lines containing all of his search terms, which does not appear to be his intent.

Instead, he could use a single line egrep to look for any of the terms, like this:

egrep -e 'pattern1|pattern2' filename.txt
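
As a small sketch of how this "any of the terms" search differs from the "all of the terms" pipeline in the other answers (sample data invented for illustration; `grep -E` stands in for `egrep`):

```shell
#!/bin/bash
# Hypothetical sample data, made up for this illustration.
sample=$(mktemp)
printf '%s\n' 'angio cardio' 'angio' 'cardio' 'neither' > "$sample"

# OR: lines containing at least one of the terms.
or_count=$(grep -E -c -e 'angio|cardio' "$sample")

# AND: lines containing both terms, via a pipeline.
and_count=$(grep 'angio' "$sample" | grep -c 'cardio')

echo "any term:  $or_count"    # 3
echo "all terms: $and_count"   # 1

rm -f "$sample"
```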
Lawrence