I have a flat file of ~740,000 patterns to grep against a ~30 GB directory.
I can work either on the ~30 GB directory itself or on an already reduced ~3 GB file.
What I want to do is essentially:
analyse from(patterns) -> directory -> any line with pattern >> patternfile
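Concretely, the end result I need is one output file per pattern, containing every line in which that pattern appears. A naive sketch of that result, just to show the intended layout (patterns.txt, data.txt and the OUT/ directory are placeholder names; at 740,000 patterns this is hopelessly slow and would also hit the open-files limit):

awk '
BEGIN {
    # load every pattern as an array key
    while ((getline p < "patterns.txt") > 0)
        pats[p] = 1
}
{
    # append the current line to the output file of every pattern it contains
    for (p in pats)
        if (index($0, p) > 0)
            print > ("OUT/" p ".out")
}
' data.txt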
So I might use something like this:
awk '
BEGIN {
    # build one big alternation regex "pat1|pat2|..." from the pattern file
    while ((getline line < "file1") > 0)
        pattern = pattern "|" line;
    pattern = substr(pattern, 2);   # drop the leading "|"
}
match($0, pattern) { for (i = 1; i <= 3; i++) { getline; print } }
' file2 > file3
But that gives me only one big output file, not one file per pattern found (each pattern should produce 7 to 15 output lines in total).
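In principle the matching pattern could be recovered from the match itself, since with plain-string patterns the matched text is the pattern; a hedged sketch of that idea (same file names as above, and in practice an alternation with 740,000 branches may well be too big for awk's regex engine anyway):

awk '
BEGIN {
    while ((getline line < "file1") > 0)
        pattern = pattern "|" line;
    pattern = substr(pattern, 2);
}
match($0, pattern) {
    id = substr($0, RSTART, RLENGTH)   # matched text == the pattern, if patterns contain no regex metacharacters
    print > ("OUT/" id ".out")         # route the matching line to that pattern's own file
}
' file2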
Or, in bash, something like this (where VB3.txt is already a much smaller test file):
while IFS= read -r; do grep -i -- "$REPLY" VB3.txt > "OUT/$REPLY.outputparID.out"; done < listeID.txt
and so on
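The obvious tweak to that loop is to run several greps at once, which spreads the work over cores but does not reduce the total amount of scanning; a sketch, assuming an xargs that supports -P and that the IDs are fixed strings (hence -F):

# same per-ID grep, up to 8 at a time; the ID is passed as $1 to avoid quoting problems
xargs -P 8 -I{} sh -c 'grep -iF -- "$1" VB3.txt > "OUT/$1.outputparID.out"' _ {} < listeID.txt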
but a quick calculation on the plain loop gives me an estimate of more than 5 days to get the results...
How can I do the same thing in 2 to 3 hours at most, or better? The difficulty here is that I need the results separated per pattern, so the plain grep -F (-f) approach cannot work as-is.
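For reference, this is the fast single-pass variant I had to rule out: everything lands in one combined file, with no indication of which ID matched which line (all_matches.out is just a placeholder name):

# fast single pass over the data, but the per-ID split is lost
grep -iF -f listeID.txt VB3.txt > all_matches.out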