I have a file containing a license header. For example:

/*
 * first line 
 * second line
 */

I would like to search for the exact content only in the first 4 lines of my source files. I tried to use

grep -x -Ff "$license_file" "$c_file"

but unfortunately grep searches every line and not the whole content, so it can find, for example, the line "/*" in the middle of a file. How can I search for the whole content? Thank you!

Ofa

1 Answer

Matching a multi-line pattern with plain grep is hard. As this answer explained, you probably need to use pcregrep:

pcregrep -M -f "$license_file" "$c_file"

However, this has two problems: it does not check where the match occurs (the start of the file, in your case), and the pattern is not in multi-line form. To begin with, the content of $license_file should look like

line 1\nline2\nline 3\nline 4\n

So if you can edit the content of $license_file into the required format, the problem can be solved with

pcregrep -HMf"$license_file" --line-offsets "$c_file" | grep -o "^$c_file:1:" | sed 's/...$//'

The --line-offsets option reports the line number of each match, so we grep for line number 1 and then strip the line number, leaving only the name of the matching file. (Since the line number is known to be 1 and the matched content is that of $license_file, I assume the file name is what you want to report.)
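As a variation on the grep and sed steps above, the filtering can be sketched with awk. This is only a sketch: the function name files_matching_at_line1 is made up for illustration, it assumes the output has the form name:LINE:OFFSET,LENGTH (as the pcregrep manual describes for -H with --line-offsets), and it assumes file names contain no colons.

```shell
# files_matching_at_line1 (hypothetical helper name)
# Reads `pcregrep -H --line-offsets` output on stdin, assumed to look like
#   name:LINE:OFFSET,LENGTH
# and prints each file name (once) that has a match starting on line 1.
# Assumption: file names do not contain ':'.
files_matching_at_line1() {
    awk -F: '$2 == 1 && !seen[$1]++ { print $1 }'
}
```

It would be used as pcregrep -HMf "$license_file" --line-offsets *.c | files_matching_at_line1, which also handles several source files in one run.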

If you cannot edit $license_file, you can convert it on the fly with sed:

pcregrep -HMf <(sed -En '1h;2,4H;4{x;s/\n|$/\\n/g;p}' "$license_file") --line-offsets "$c_file" | grep -o "^$c_file:1:" | sed 's/...$//'
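One caveat with the header in the question: its lines contain `*`, which PCRE treats as a quantifier rather than a literal character, so the converted pattern may not match the text as written. A possible workaround, sketched below under the assumption that no header line contains the sequence \E, is to wrap each line in PCRE's \Q...\E literal quoting while joining the lines with \n. The function name license_to_pattern is hypothetical.

```shell
# license_to_pattern LICENSE_FILE   (hypothetical helper name)
# Prints a single pattern line in which every header line is quoted
# literally with \Q...\E and followed by a literal "\n" sequence.
# Assumption: no line of LICENSE_FILE contains "\E".
license_to_pattern() {
    # Wrap each line and append the two characters '\' and 'n' ...
    sed -e 's/.*/\\Q&\\E\\n/' "$1" |
        # ... then drop the real newlines so everything is on one line.
        tr -d '\n'
    # Terminate the pattern line with a single real newline.
    echo
}
```

The result could then replace the sed process substitution, for example pcregrep -HMf <(license_to_pattern "$license_file") --line-offsets "$c_file".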
dibery
  • Unfortunately pcregrep does not work in my environment. I get the error message "error while loading shared libraries: libpcre.so.1" – Ofa Jan 27 '20 at 10:33
  • Try installing `pcre`, `libpcre`, or `libpcre3`. (I think it should be `libpcre3`, but it may have a different name in your distribution.) – dibery Jan 27 '20 at 14:46
  • Thanks. Now I'm able to use pcregrep. I've edited the license_file but it still can't find the pattern in the given c file. It works when license_file contains one word, but it stops working when the file contains two words. Does it have any problem with spaces? What am I missing? – Ofa Jan 28 '20 at 15:00
  • Um, the `line 1` example above works for me. Can you try generating a text file by `seq -f 'line %.0f' 5 > tmp_file` and putting `line 1\nline2\nline 3\nline 4\n` in the pattern file? – dibery Jan 28 '20 at 16:17
  • Maybe irrelevant, but which OS and pcregrep versions are you using? My pcregrep is version 8.39 2016-06-14. – dibery Jan 28 '20 at 16:22
  • One more thing to try: `pcregrep -HM "$(cat "$license_file")" "$c_file"`. If none of these works, post sample content so that we can see what's wrong. – dibery Jan 28 '20 at 16:29
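If pcregrep is not available at all, a different technique avoids regular expressions entirely: compare the first lines of the source file byte for byte with the license file using head and cmp. This is not the pcregrep approach from the answer, only a minimal sketch of an alternative. The function name has_license_header is hypothetical, and the sketch assumes the license file ends with a newline.

```shell
# has_license_header LICENSE_FILE SOURCE_FILE   (hypothetical helper name)
# Exits 0 if SOURCE_FILE begins with the exact content of LICENSE_FILE.
# Assumption: LICENSE_FILE ends with a newline, so wc -l counts all its lines.
has_license_header() {
    license_file=$1
    src_file=$2
    # Number of header lines (4 in the question); the arithmetic expansion
    # strips any padding that wc may print around the number.
    n=$(( $(wc -l < "$license_file") ))
    # Compare the first n lines of the source with the header, byte for byte.
    head -n "$n" "$src_file" | cmp -s - "$license_file"
}
```

For example, has_license_header "$license_file" "$c_file" && echo "$c_file" prints the file name only when the header is at the very top, so a stray "/*" in the middle of the file no longer counts as a match.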