Count matches per literal string
If you simply want to know how often each pattern appears and each of your pattern is a fixed string (not a regex), use ...
grep -oFf needles.txt haystack.txt | sort | uniq -c
Count matching lines per literal string
Note that above is slightly different from your formulation " I want to know how many lines contain that pattern" as one line can have multiple matches. If you really have to count matching lines per pattern instead of matches per pattern, then things get a little bit trickier:
grep -noFf needles.txt haystack.txt | sort | uniq | cut -d: -f2- | uniq -c
Count matching lines per regex
If the patterns are regexes, you probably have to iterate over the patterns, as grep
's output only tells you that (at least) one pattern matched, but not which one.
# this will be very slow if you have many patterns
while IFS= read -r pattern; do
printf '%8d %s\n' "$(grep -ce "$pattern" haystack.txt)" "$pattern"
done < needles.txt
... or use a different tool/language like awk
or perl
.
Note on overlapping matches
You did not formulate any precise requirements, so I went with the simplest solutions for each case. The first two solutions and the last solution behave differently in case multiple patterns match (part of) the same substring.
grep -f needles.txt
matches each substring at most once. Therefore some matches might be "missed" (interpretation of "missed" depends on your requirements)
- whereas iterating
grep -e pattern1; grep -e pattern2; ...
might match the same substring multiple times.