Is there any faster way to grep the pattern or to increase the speed of while loop?

Question

This loop is slow because main_file has lakhs of entries.

while read line;do  
         cat main_file | grep "$line";
done<file

cat file

pattern_1
pattern_2

cat main_file
  main  pattern_1
  main  pattern_2
  main pattern_2

The grep command does not need you to feed it one line at a time; it already reads every line of its input. So just use grep itself and have it print the lines that contain what you are looking for. Check the grep help; it is all in there. — Mark Manning, Apr 26 '17 at 05:15
YOU must try it first THEN we help out. You can read about all of the options to GREP online just by Googling it. There is no reason for you to give up that easily. — Mark Manning, Apr 26 '17 at 05:22
"grep -f file main_file" I have tried this command, but it is also slow — sam ali, Apr 26 '17 at 05:29
Yeah, I see someone gave that to you below. So much for you reading the man pages on GREP. How about this - If you did it in PHP it would be a lot faster. GREP has a lot of options. It is built to work and work correctly each time but it was not built for speed. PHP only executes what you tell it to execute. Also, it is optimized to handle large chunks of data. So it is a lot faster than GREP (which was created in something like the 1950s.) — Mark Manning, Apr 26 '17 at 05:32
main_file contains 5 million lines, and I can't predict the pattern or which column it appears in. @codeforester — sam ali, Apr 26 '17 at 05:49
There is a useless use of cat: `cat main_file | grep "$line"` should be written `grep "$line" main_file` — jehutyy, Apr 26 '17 at 11:56

score 3 · Accepted Answer · edited May 23 '17 at 12:10


Your current approach is very inefficient: it starts a new grep and rereads main_file once for every line of file. The whole loop can be done in a single grep with the -f option.

grep -Fxf file main_file

-F treats the lines in file as fixed strings, not regular expressions
-x matches only when the entire line equals the string (drop it to match anywhere in the line)
-f file reads the patterns from file, one per line, and looks for them in main_file
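
To make the effect of -x concrete, here is a minimal sketch that recreates the sample files from the question in a scratch directory and runs the single-grep form with and without -x. The workdir and the two result variable names are made up for this illustration.

```shell
#!/bin/sh
# Recreate the sample files from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'pattern_1' 'pattern_2' > "$workdir/file"
printf '%s\n' '  main  pattern_1' '  main  pattern_2' '  main pattern_2' > "$workdir/main_file"

# -F -f without -x: fixed-string match anywhere in the line, one pass over main_file.
substring_matches=$(grep -Ff "$workdir/file" "$workdir/main_file")

# With -x the whole line must equal one of the strings, so nothing matches here.
# "|| true" keeps grep's "no match" exit status from being treated as a failure.
exact_matches=$(grep -Fxf "$workdir/file" "$workdir/main_file" || true)

printf 'without -x:\n%s\n' "$substring_matches"
printf 'with -x:\n%s\n' "$exact_matches"
rm -rf "$workdir"
```

On this sample data the run without -x prints all three lines of main_file, and the run with -x prints nothing, because every line carries the extra "main" prefix.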

The above approach will work well as long as the files are small. For larger files, use awk (this assumes the pattern is always the second whitespace-separated field of main_file):

awk 'FNR==NR {hash[$1]; next} $2 in hash' file main_file
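
As a rough illustration of how the one-liner works: while awk reads the first file, FNR==NR is true, so each pattern is stored as a key of the hash array; for the second file, a line is printed only if its second field is one of those keys. The sketch below runs it on the question's sample data plus one non-matching line (pattern_3) that is invented here to show the filtering.

```shell
#!/bin/sh
workdir=$(mktemp -d)
printf '%s\n' 'pattern_1' 'pattern_2' > "$workdir/file"
# The last line is a hypothetical non-matching entry added for illustration.
printf '%s\n' '  main  pattern_1' '  main  pattern_2' '  main pattern_2' '  main  pattern_3' > "$workdir/main_file"

# First file: remember each pattern as an array key.
# Second file: print the line when field 2 is a remembered key.
awk_matches=$(awk 'FNR==NR {hash[$1]; next} $2 in hash' "$workdir/file" "$workdir/main_file")

printf '%s\n' "$awk_matches"
rm -rf "$workdir"
```

Each line of main_file costs a single array lookup regardless of how many patterns there are, which is why this form can scale better than passing a long pattern list to grep.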

For details, see this post, which has other solutions as well:

Fastest way to find lines of a text file from another larger text file in Bash


answered Apr 26 '17 at 05:27

codeforester

Here $2 means the second column of main_file, right? The problem is that we don't know which column of main_file the pattern is in. @codeforester – sam ali Apr 26 '17 at 05:57
Drop the -x option from grep in that case. And the awk solution won't work for you unless you have a fixed column to apply the match. – codeforester Apr 26 '17 at 07:57
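
If the column is not fixed, one possible variation (not from the answer above, just a sketch) is to keep the hash lookup but test every field of each line. This assumes the pattern always appears as a complete whitespace-separated field; the sample lines below are invented for the illustration.

```shell
#!/bin/sh
workdir=$(mktemp -d)
printf '%s\n' 'pattern_1' 'pattern_2' > "$workdir/file"
# Hypothetical data: the pattern sits in a different column on each line.
printf '%s\n' 'pattern_1 main' 'main other pattern_2' 'main pattern_22' 'nothing here' > "$workdir/main_file"

# Load the patterns as array keys, then print a line as soon as any
# one of its fields is a key.
any_field_matches=$(awk '
    FNR==NR {hash[$1]; next}
    {for (i = 1; i <= NF; i++) if ($i in hash) {print; next}}
' "$workdir/file" "$workdir/main_file")

printf '%s\n' "$any_field_matches"
rm -rf "$workdir"
```

Unlike grep -F without -x, this compares whole fields, so the line containing pattern_22 is not reported; whether that is desirable depends on the data.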
