
There are several hundred log files; the average size of each log is 20 MB. The task is to find multiple occurrences of a certain string in each log.

How can this be done as fast as possible? The current approach:

  1. The script reads as many files as possible into 1 GB of RAM.
  2. A big hash table of the rows is created.
  3. strpos() is applied to each row.
  4. The results are collected and printed afterwards.

The bottleneck is step (3). Is there any way to make the search faster? Any solution would be accepted: Redis, additional RAM, indexed tables... Thanks! (A sketch of the current pipeline follows below.)
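For reference, a minimal sketch of the pipeline described in the question. This is an assumption of what the script looks like, not the asker's actual code: the file pattern, the `$needle` value, and the exact buffering strategy are all illustrative.

```php
<?php
// Hedged reconstruction of the described approach; paths and needle are assumptions.
$needle   = 'ERROR 42';                       // string being searched for (hypothetical)
$logFiles = glob('/var/log/app/*.log') ?: []; // hypothetical log location
$budget   = 1024 * 1024 * 1024;               // ~1 GB RAM budget from step 1

// Steps 1-2: buffer file rows in a big PHP array (PHP arrays are hash tables).
$buffered = [];
$used     = 0;
foreach ($logFiles as $path) {
    $size = filesize($path);
    if ($used + $size > $budget) {
        break;                                // stop once the RAM budget is spent
    }
    $buffered[$path] = file($path, FILE_IGNORE_NEW_LINES);
    $used += $size;
}

// Step 3 (the stated bottleneck): strpos() on each row.
$hits = [];
foreach ($buffered as $path => $rows) {
    foreach ($rows as $lineNo => $row) {
        if (strpos($row, $needle) !== false) {
            $hits[] = [$path, $lineNo + 1];
        }
    }
}

// Step 4: collect and print the results.
foreach ($hits as [$path, $lineNo]) {
    echo "$path:$lineNo\n";
}
```

One refinement worth noting (not from the original post): since strpos() itself is a fast C-level call, much of the cost here is the per-row loop; scanning each file's contents as a single string, via strpos() with an offset or substr_count() when only the number of occurrences matters, avoids millions of function calls.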
Volodymyr Nabok
  • Possible duplicate of [What is the fastest way to find the occurrence of a string in another string?](http://stackoverflow.com/questions/5821483/what-is-the-fastest-way-to-find-the-occurrence-of-a-string-in-another-string) – Elias Sep 13 '16 at 08:40
  • maybe interesting? http://sphinxsearch.com/info/studies/ – Ryan Vincent Sep 13 '16 at 10:41
  • Yes, thanks. As I understand it, there is no way to do a fast search without external tools. Is that right? – Volodymyr Nabok Sep 14 '16 at 12:22

0 Answers