There are several hundred log files, each about 20 MB on average. The task is to find all occurrences of a given string in each log.
How can this be done as fast as possible? The current approach:
1. The script reads as many files as fit into 1 GB of RAM.
2. A big hash table is built from them.
3. strpos() is applied to each row.
4. The results are collected and printed afterwards.

The bottleneck is step 3. Is there any way to make the search faster? Any solution would be accepted: Redis, additional RAM, indexed tables... Thanks!
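For reference, here is a minimal sketch of the approach described above. The directory path, search string, and the exact batching logic are assumptions for illustration, not the asker's actual code:

```php
<?php
// Sketch of the described approach. $logDir, $needle and the 1 GB
// batch limit are assumed placeholders.
$logDir = '/var/log/app';
$needle = 'ERROR';
$limit  = 1024 ** 3; // ~1 GB per in-memory batch

// Step 3 (the bottleneck): strpos() applied to every row.
function scanBatch(array $batch, string $needle, array &$matches): void
{
    foreach ($batch as $file => $rows) {
        foreach ($rows as $i => $row) {
            if (strpos($row, $needle) !== false) {
                $matches[$file][] = $i + 1; // 1-based line number
            }
        }
    }
}

$matches = $batch = [];
$batchSize = 0;

foreach (glob($logDir . '/*.log') as $file) {
    // Steps 1-2: read files into a big hash table until ~1 GB is loaded.
    $batch[$file] = file($file);
    $batchSize  += filesize($file);
    if ($batchSize >= $limit) {
        scanBatch($batch, $needle, $matches);
        $batch = [];
        $batchSize = 0;
    }
}
scanBatch($batch, $needle, $matches); // flush the final partial batch

// Step 4: collect and print the results.
foreach ($matches as $file => $lines) {
    echo $file, ': lines ', implode(', ', $lines), PHP_EOL;
}
```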