I need to egrep a large CSV file with 2 million lines, and I want to cut the egrep time down to 0.5 sec. Is that possible at all? No, I don't want a database (sqlite3 or MySQL) at this time.
$ time wc foo.csv
2000000 22805420 334452932 foo.csv
real 0m3.396s
user 0m3.261s
sys 0m0.115s
I've already been able to cut the run time from about 40 secs to 1.75 secs, mainly by setting LC_ALL=C; as I understand it, that skips the multibyte (UTF-8) case-conversion path that makes -i so slow in the default locale:
$ time egrep -i "storm|broadway|parkway center|chief financial" foo.csv|wc -l
108292
real 0m40.707s
user 0m40.137s
sys 0m0.309s
$ time LC_ALL=C egrep -i "storm|broadway|parkway center|chief financial" foo.csv|wc -l
108292
real 0m1.751s
user 0m1.590s
sys 0m0.140s
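Since all four patterns are fixed strings, I'm also wondering whether grep -F with a pattern file would help; as far as I know, GNU grep can then use a multi-string matcher instead of the full regex engine. A rough, untested sketch (patterns.txt is just a name I made up):

$ printf '%s\n' storm broadway 'parkway center' 'chief financial' > patterns.txt
$ time LC_ALL=C grep -Fic -f patterns.txt foo.csv

(-c counts the matching lines directly, so the pipe to wc -l isn't needed.)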
But I want the egrep real time to be under half a second, so any tricks would be greatly appreciated. The file changes continuously, so I can't use any caching mechanism.
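Since the file is re-read fresh every time anyway, one other direction I'm considering is splitting the scan across CPU cores with GNU parallel. A rough sketch (the 32M block size is a guess; --pipepart splits on line boundaries, and the awk sums the per-chunk counts back into the wc -l total):

$ time parallel --pipepart -a foo.csv --block 32M \
    "LC_ALL=C egrep -ic 'storm|broadway|parkway center|chief financial'" \
    | awk '{ total += $1 } END { print total }'

Each chunk is scanned by its own egrep process, so on a 4-core box this could roughly quarter the wall-clock time, assuming the read itself isn't the bottleneck. Is that the right way to go, or is there something better?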