
I have a PHP script that works a lot with large text files, mostly log files. Most of the time I only want a section of a file, from one split point to another, but having to read a 2GB text file just to get a small portion of it slows the process down.

Is there any way I can read only parts of the text without having to read the entire file into memory?

The data is stored like this:

|18.05.2013: some log info here...
|19.05.2013: some log info here...
|20.05.2013: some log info here...
|21.05.2013: some log info here...
|22.05.2013: some log info here...
| etc...

So if I wanted only "19.05.2013" I would still have to read all the other text as well. Is there any way I can read only that part?

P.S. A database is not an option, and splitting the files into smaller files is also impractical.

George Cummins
Daniel
  • What OS are you using? – Jens May 24 '13 at 18:20
  • "a database is not an option" - why not? –  May 24 '13 at 18:21
  • If each "record" in the file is variable length, there is no way for you to know how far to skip. If you built some kind of index beforehand you could, but it doesn't seem worth it in this instance. – Sean Bright May 24 '13 at 18:22
  • Well, you have to read through it to read parts. Is `fgets`-ing through it not an option? Memory-wise, it's pretty lightweight, and if you know there aren't entries after a certain point, you can abort pretty quick. Also, if this is a log file, a decent `logrotate` config helps a lot. – Wrikken May 24 '13 at 18:23
  • Similar? http://stackoverflow.com/questions/6733243/reading-a-block-of-lines-in-a-file-using-php – DannyB May 24 '13 at 18:25
  • This may help you http://stackoverflow.com/questions/3686177/php-to-search-within-txt-file-and-echo-the-whole-line – Srikanth Kolli May 24 '13 at 18:26
  • I don't think it will @SrikanthKolli – Toby Allen May 24 '13 at 18:53

1 Answer


I think you are looking for fseek.

For this to work, however, your data must be laid out so that the byte offset where the Yth record begins can be computed. In practice, if every log entry can have the same length, this is an efficient approach. Otherwise, you will still need to read every line to search for the one you want.

For example (untested, but enough to get you started):

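function getDataFromFile($fileName, $start, $length) {
    $f_handle = fopen($fileName, 'r');  // PHP variable names are case-sensitive
    fseek($f_handle, $start);           // jump to the byte offset without reading what precedes it
    $str = fread($f_handle, $length);   // read up to $length bytes (fgets would stop at a newline)
    fclose($f_handle);
    return $str;
}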

Then:

$fname = 'myfile.txt';
$DATA_LENGTH = 50;   // bytes per record, including the line ending
$wanted_data = 12;   // zero-based index of the record to fetch

$data = getDataFromFile($fname, $DATA_LENGTH * $wanted_data, $DATA_LENGTH);
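If the entries are not all the same length, as in the log from the question, the offset cannot be computed, so the fallback is to read line by line and stop as soon as the wanted section has passed. The following is only a rough sketch of that idea: `readSection` and its two marker parameters are hypothetical names, and it assumes the start and end markers (for example two dates) each appear literally in a line of the file. Only one line is held in memory at a time, and nothing after the end marker is read.

```php
<?php
// Hypothetical helper (not from the original answer): collect the lines from
// the first one containing $startMarker up to, but not including, the first
// later line containing $endMarker.
function readSection(string $path, string $startMarker, string $endMarker): array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $lines = [];
    $inSection = false;

    // fgets() returns one line per call, so memory use stays small.
    while (($line = fgets($handle)) !== false) {
        if (!$inSection) {
            if (strpos($line, $startMarker) !== false) {
                $inSection = true;
                $lines[] = rtrim($line, "\r\n");
            }
            continue;
        }
        if (strpos($line, $endMarker) !== false) {
            break; // past the section: skip the rest of the file
        }
        $lines[] = rtrim($line, "\r\n");
    }

    fclose($handle);
    return $lines;
}
```

With the sample data, `readSection('myfile.txt', '19.05.2013', '20.05.2013')` would return the entries for 19.05.2013. Everything before the start marker still has to be scanned, so this saves memory and the tail of the file, not the leading part.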

I hope this helps.

Frederik.L