
This loop takes a long time because main_file has lakhs (hundreds of thousands) of entries.

while read line; do
    cat main_file | grep "$line"
done < file

cat file

pattern_1
pattern_2

cat main_file
  main  pattern_1
  main  pattern_2
  main pattern_2
codeforester
sam ali
  • The command line grep command does not need you to read each line. It is made to read each line already. So just use the GREP command itself and have it spew out the lines where what you are looking for are. Check out the help file about GREP. It has all of this in there. – Mark Manning Apr 26 '17 at 05:15
  • help me with faster command if possible. – sam ali Apr 26 '17 at 05:17
  • YOU must try it first THEN we help out. You can read about all of the options to GREP online just by Googling it. There is no reason for you to give up that easily. – Mark Manning Apr 26 '17 at 05:22
  • I have this command, "grep -f file main_file", but it is also slow. – sam ali Apr 26 '17 at 05:29
  • Yeah, I see someone gave that to you below. So much for you reading the man pages on GREP. How about this - If you did it in PHP it would be a lot faster. GREP has a lot of options. It is built to work and work correctly each time but it was not built for speed. PHP only executes what you tell it to execute. Also, it is optimized to handle large chunks of data. So it is a lot faster than GREP (which was created in something like the 1950s.) – Mark Manning Apr 26 '17 at 05:32
  • Let's know how big your files are. – codeforester Apr 26 '17 at 05:32
  • main_file contains 5 million lines, and I can't predict the pattern or which column it appears in. @codeforester – sam ali Apr 26 '17 at 05:49
  • There is a useless use of cat: `cat main_file | grep "$line"` should be written `grep "$line" main_file` – jehutyy Apr 26 '17 at 11:56

1 Answer


Your current approach is very inefficient - the whole loop could be done in a single grep, with the -f option.

grep -Fxf file main_file
  • -F treats lines in file as strings, not patterns
  • -x looks for exact matching line (if that is what you want)
  • -f file reads the lines from file and looks for them in main_file
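To see what -x changes, here is a minimal sketch that recreates the sample files from the question in a scratch directory (the `workdir` location and the variable names are assumptions made for illustration). Because every line in the sample main_file carries a leading "main" column, no line is exactly equal to a pattern, so the -x form should print nothing while the form without -x prints every line that contains a pattern.

```shell
# Recreate the sample files from the question in a scratch directory
# (the directory and variable names are illustrative, not from the post).
workdir=$(mktemp -d)
printf '%s\n' 'pattern_1' 'pattern_2' > "$workdir/file"
printf '%s\n' '  main  pattern_1' '  main  pattern_2' '  main pattern_2' > "$workdir/main_file"

# -x requires the whole line to equal a pattern, so nothing matches here.
# grep exits with status 1 when it finds no match, hence the "|| true".
exact_matches=$(grep -Fxf "$workdir/file" "$workdir/main_file" || true)

# Without -x, each pattern is matched as a fixed substring anywhere in the line.
substring_matches=$(grep -Ff "$workdir/file" "$workdir/main_file")

printf 'with -x:\n%s\n' "$exact_matches"
printf 'without -x:\n%s\n' "$substring_matches"
```

If the pattern can sit anywhere in the line, the form without -x is the one that fits the question.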

The above approach works well as long as the files are small. For larger files, use awk; this version assumes the pattern is always in the second column of main_file:

awk 'FNR==NR {hash[$1]; next} $2 in hash' file main_file
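If the column is not known in advance, one possible variation (an assumption on my part, not part of the command above) is to test every field of each line against the hash. This only finds patterns that are a complete whitespace-separated field, not substrings of a field, so it is a sketch rather than a drop-in replacement for grep -F. The sample data and variable names below are made up for illustration.

```shell
# Illustrative sample data: the pattern appears in different columns.
workdir=$(mktemp -d)
printf '%s\n' 'pattern_1' 'pattern_2' > "$workdir/file"
printf '%s\n' '  main  pattern_1' 'pattern_2 main' 'main other' > "$workdir/main_file"

any_field_matches=$(awk '
  FNR == NR { hash[$1]; next }          # first file: remember each pattern
  {
    for (i = 1; i <= NF; i++)           # second file: test every field
      if ($i in hash) { print; next }   # print the line once and move on
  }
' "$workdir/file" "$workdir/main_file")

printf '%s\n' "$any_field_matches"
```

Each line of main_file still costs one hash lookup per field, so the work grows with the number of fields rather than with the number of patterns.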

For details, look at this post - it had other solutions as well:

codeforester
  • Here $2 means the second column of main_file, right? The problem is that we don't know which column of main_file the pattern is in. @codeforester – sam ali Apr 26 '17 at 05:57
  • Drop the -x option from grep in that case. And the awk solution won't work for you unless you have a fixed column to apply the match. – codeforester Apr 26 '17 at 07:57