
I want to use grep to search a file for matching strings, but because the file is very large, I only search its first 500 lines.

I wrote this shell script:

#!/bin/bash

patterns=(
llc_prefetcher_operat
to_prefetch
llc_prefetcher_cache_fill
)


search_file_path="mix1-bimodal-no-bop-lru-4core.txt"
echo ${#patterns[*]}
cmd="head -500 ${search_file_path} | grep -a "
for(( i=0;i<${#patterns[@]};i++)) do
cmd=$cmd" -e "\"${patterns[i]}\"
done;
echo $cmd
$cmd >junk.log

The result of running the script is:

3
head -500 mix1-bimodal-no-bop-lru-4core.txt | grep -a -e "llc_prefetcher_operat" -e "to_prefetch" -e "llc_prefetcher_cache_fill"
head: invalid option -a
Try 'head --help' for more information.

The second line of that output is the command string the script tried to execute. When I run that same string directly on the command line, it succeeds:

 head -500 mix1-bimodal-no-bop-lru-4core.txt | grep -a -e "llc_prefetcher_operat" -e "to_prefetch" -e "llc_prefetcher_cache_fill"

Note that without the -a option, grep treats the input as a binary file, so I need to keep that option.

Why does this problem occur? Thank you!

Yujie
  • `head: invalid option -a` so head is seeing the -a, not grep - suggest perhaps the `|` needs to be escaped (just a guess) – John3136 Jul 09 '21 at 03:40
  • Read [BashFAQ #50](https://mywiki.wooledge.org/BashFAQ/050). Repeat until you're convinced that shell commands should __never__ be stored in strings. – Charles Duffy Jul 09 '21 at 04:07
  • (...if you're thinking about using `eval` to work around the problem, then read [BashFAQ #48](https://mywiki.wooledge.org/BashFAQ/048) in addition). – Charles Duffy Jul 09 '21 at 04:08
  • @John3136, adding more escaping doesn't fix the problem -- the problem is that the results of parameter expansions, command substitutions, etc. are only subject to string-splitting and glob expansion, but no other parsing steps, so a `|` created by one of those operations will _never_ be treated as syntax (unless one does something that restarts the parsing process from the beginning, but that way lie shell injection vulnerabilities) – Charles Duffy Jul 09 '21 at 04:09
  • @CharlesDuffy - as I said, just a guess. Key part for me is that it was "head" not "grep" complaining about the -a. – John3136 Jul 09 '21 at 04:13

1 Answer


Instead of trying to build a string holding a complex command, you're better off using grep's -f option and bash process substitution to pass the list of patterns to search for:

head -500 "$search_file_path" | grep -Faf <(printf "%s\n" "${patterns[@]}") > junk.log

It's shorter, simpler and less error-prone.

(I added -F to the grep options because none of your example patterns contain regular expression metacharacters, so fixed-string searching will likely be faster.)
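To try the one-liner above without the original data, here is a minimal sketch. The temporary file and its three lines are made up for illustration; only the `patterns` array and the `head | grep -Faf` pipeline come from the question and this answer.

```bash
#!/bin/bash
# Patterns from the question.
patterns=(llc_prefetcher_operat to_prefetch llc_prefetcher_cache_fill)

# Hypothetical sample input standing in for the real trace file.
sample_file=$(mktemp)
printf '%s\n' \
    'cpu0 llc_prefetcher_operate addr=0x10' \
    'unrelated line' \
    'to_prefetch queue full' > "$sample_file"

# printf emits one pattern per line, which is the format grep -f reads.
# The process substitution <(...) hands that list to grep as a file name.
matches=$(head -n 500 "$sample_file" | grep -Faf <(printf '%s\n' "${patterns[@]}"))
printf '%s\n' "$matches"

rm -f "$sample_file"
```

With this sample input, the two lines containing a pattern should be printed and the unrelated line dropped. Process substitution is a bash feature, so the script needs to run under bash rather than plain sh.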


The biggest problem with your approach is that when $cmd is word-split, the | is just another argument to head, along with everything after it, which is why head is the one complaining about -a. It's not treated as a pipeline operator the way a literal | typed in the script is.
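The word splitting described here can be made visible with a short sketch. The command string is a shortened, made-up stand-in for `$cmd`, and `words`, `grep_args` and `result` are names chosen for illustration. The second half shows the array approach that BashFAQ #50 recommends if you do want to build the options in a loop: only grep's arguments go into the array, and the pipe stays literal in the script.

```bash
#!/bin/bash
# Assumption: a shortened stand-in for the $cmd string built in the question.
cmd='head -500 data.txt | grep -a -e "to_prefetch"'

set -f          # turn off glob expansion so only word splitting is shown
words=($cmd)    # splits the string the same way an unquoted $cmd does
set +f
printf '<%s>\n' "${words[@]}"
# The fourth word is a plain "|" string, and the quotes around to_prefetch
# stay as literal characters. Running $cmd would pass all of these to head.

# Array alternative: collect grep's options, keep the pipeline in the script.
patterns=(llc_prefetcher_operat to_prefetch llc_prefetcher_cache_fill)
grep_args=(-a)
for p in "${patterns[@]}"; do
    grep_args+=(-e "$p")
done

# Hypothetical two-line input in place of the real file.
result=$(printf '%s\n' 'to_prefetch hit' 'no match here' | head -n 500 | grep "${grep_args[@]}")
printf '%s\n' "$result"
```

Quoting `"${grep_args[@]}"` expands each array element as exactly one argument, so patterns containing spaces or glob characters would also survive intact.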

Shawn