
I have a bash function that goes through a list of files, greps for a certain word, and writes the matches to a file. The problem is I get every instance of the word, and I just want the first instance. I came across a solution using `head -1` with the grep, but now my function just hangs when I call it.

    function processAppLogs {
        for i in `find $log_des -name $fname`
        do
            p=$i
            d=${p//applog.txt/unmountTimestampList.txt}
            grep "UNMOUNTED" $i >> $d
            grep "PATIENT ID" | head -1 | $i >> $d
        done
    }

I'm looking to grep only the first instance of "PATIENT ID" but I think I might have the syntax wrong? Is that the proper way to grep the first instance and write that to a file?

fedorqui
cyberbemon
  • So what was the question here: how can I make this code work or how can I grep just once? `grep ... | head` will grep all the file and then get the first match, whereas `grep -m1` is more optimal since it stops as soon as it finds the first match. – fedorqui Jun 11 '15 at 10:28
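To see the difference the comment describes, here is a small sketch (the file path is made up for illustration). Both pipelines print only the first match; the difference is that `grep -m 1` stops reading the file as soon as it finds it, while `grep | head` scans the whole file.

```shell
# Create a throwaway log with two matching lines (hypothetical path).
printf 'PATIENT ID: 1\nUNMOUNTED\nPATIENT ID: 2\n' > /tmp/applog_demo.txt

grep "PATIENT ID" /tmp/applog_demo.txt | head -1   # reads the whole file, prints first match
grep -m 1 "PATIENT ID" /tmp/applog_demo.txt        # stops at the first match
```

Both commands print `PATIENT ID: 1`.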

3 Answers


You can use `-m 1` to indicate that you just want to match once:

grep -m 1 "PATIENT ID" "$i" >> "$d"

From man grep:

-m NUM, --max-count=NUM

Stop reading a file after NUM matching lines.

Test

$ seq 8 12
8
9
10
11
12
$ seq 8 12 | grep 1 -m 1
10
$ seq 8 12 | grep 1 -m 2
10
11
fedorqui
    Much more efficient this; `grep` will terminate at the earliest possible instant and you save on an additional shell `fork` and `exec` compared to the currently accepted answer. – cueedee Feb 28 '20 at 09:00

As a side note to @fedorqui's answer.

What you got wrong is the order of the commands: `head` must come after `grep`, followed by the redirection. Like

grep "PATIENT ID" $i | head -1  >> $d
nu11p01n73R

The answer is to use `-m`, to specify a maximum number of matches, if your version of `grep` supports it. Otherwise, piping the output to `head` will work. `head` exits after it reaches the number of lines specified, breaking the pipe and causing `grep` to exit, so the outcome will be the same (albeit at the expense of a pipe).

A portable alternative using a single process would be to use awk:

awk '/PATIENT ID/ {print; exit}' "$i"
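For instance, against a throwaway log (path made up for illustration), the `awk` version prints only the first matching line and then stops reading the file:

```shell
# Hypothetical sample log with two matching lines.
printf 'UNMOUNTED\nPATIENT ID: 7\nPATIENT ID: 8\n' > /tmp/awk_demo.txt

# /PATIENT ID/ selects matching lines; exit stops after the first one
awk '/PATIENT ID/ {print; exit}' /tmp/awk_demo.txt
```

This prints `PATIENT ID: 7` only.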

Loops like for i in `find $log_des -name $fname` should be avoided, as they rely on well-behaved filenames (no spaces or glob characters present) and break otherwise.
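A quick way to see the breakage (directory and file names here are made up): with a space in the filename, word splitting hands the loop two broken fragments instead of one path.

```shell
# Hypothetical directory containing a filename with a space in it.
mkdir -p /tmp/loopdemo
touch "/tmp/loopdemo/my applog.txt"

# The single path "/tmp/loopdemo/my applog.txt" is split at the space,
# so the loop body runs twice with two useless fragments.
for i in `find /tmp/loopdemo -name '*applog.txt'`; do
    printf '[%s]\n' "$i"
done
```

This prints `[/tmp/loopdemo/my]` and then `[applog.txt]`, neither of which is a file that exists.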

If you're relying on find to do a recursive directory search, then use globstar instead, as it allows you to do what you want safely.

shopt -s globstar
for i in **/"$fname"; do
    d=${i//applog.txt/unmountTimestampList.txt}
    { grep "UNMOUNTED" "$i"; grep -m 1 "PATIENT ID" "$i"; } >> "$d"
done

As a bonus I've also redirected a single block, rather than each command separately (perhaps you can use > instead now?) and safely quoted your variables.

Tom Fenech
  • Nice explanation! Note you could also probably do `while read ... do; done < <(find ...)` – fedorqui Jun 11 '15 at 10:33
  • @fedorqui thanks. The approach using `while read` still misses the (pathological) case of a newline in a filename and relies on external tools, so I still think that this is the best way in bash. The other safe alternative would be to use `-print0` with `find` but that's non-standard too. – Tom Fenech Jun 11 '15 at 10:35
  • Good point. I just started a bounty in [how to loop list of file names returned by find](http://stackoverflow.com/q/9612090/1983854) so that we can have a canonical answer we can always refer to. – fedorqui Jun 11 '15 at 10:41