
I have a Laravel 5.3 project. I need to import and parse a pretty large (1.6M lines) text file.

I am running into memory limit issues. I think I need to chunk the data at some point, but I am having trouble even getting the file loaded in order to do so.

Here is what I am trying:

    if(Input::hasFile('file')){
        $path = Input::file('file')->getRealPath(); //assign file from input
        $data = file($path); //load the file
        $data->chunk(100, function ($content) { //parse it 100 lines at a time
            foreach ($content as $line) {
                //use $line
            }
        });
    }

I understand that file() will return an array vs. File::get() which will return a string.

I have increased my php.ini upload and memory limits to handle the file size, but I am running into this error:

    Allowed memory size of 524288000 bytes exhausted (tried to allocate 4096 bytes)

It occurs at this line:

    $data = file($path);

What am I missing? And is this even the right way to go about it?

Thanks!

RushVan
    `file($path)` is going to load the entire file into memory, that's your main issue. Use fgets instead to read the file one line at a time. – Devon Bessemer Jun 12 '18 at 15:24
  • Devon, you were right. If you submit as an answer, I'll give you the credit. – RushVan Jun 12 '18 at 17:33
  • So now that it is running, what's the best practice to stop it from timing out? I can extend the max execution time but is there a more delicate way? – RushVan Jun 12 '18 at 17:34

2 Answers


As mentioned in the comments, file() reads the entire file into an array, in this case 1.6 million elements, which is almost certainly what exhausts your memory limit. Instead, read the file one line at a time, each read overwriting the previous line, so memory use stays constant:

$fh = fopen($path "r");
if($fh) {
    while(($line = fgets($fh)) !== false) {
        //use $line
    }
}

To keep the script from timing out, lift PHP's maximum execution time at the start of the import (0 means no limit):

    set_time_limit(0);
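
If you still want to process the file in batches of 100 lines, as in your original attempt, you can buffer lines as you read them. A minimal sketch, where the batch size is arbitrary and your per-batch work goes where the comments indicate:

    $fh = fopen($path, "r");
    $batch = [];
    while (($line = fgets($fh)) !== false) {
        $batch[] = $line;
        if (count($batch) === 100) {
            // process the 100 buffered lines here, e.g. one bulk insert
            $batch = [];
        }
    }
    if ($batch) {
        // process the final partial batch
    }
    fclose($fh);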
AbraCadaver

If the file is too large, split it outside of PHP first. You can safely call the Linux split command through exec(); doing the same work in the PHP interpreter alone needs a lot of memory and takes a long time, while the shell command saves you that time on every run. Here, -C 20m caps each piece at roughly 20 MB without breaking lines, and --numeric-suffixes names the pieces output_prefix00, output_prefix01, and so on:

exec('split -C 20m --numeric-suffixes input_filename output_prefix');

After that, you can use PHP's DirectoryIterator to read each piece in turn.
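
A rough sketch of that loop; the directory path and the output_prefix filter are placeholders for wherever you wrote the pieces:

    $dir = new DirectoryIterator('/path/to/pieces'); // directory the split pieces were written to
    foreach ($dir as $fileInfo) {
        // skip "." and ".." and anything that is not one of our pieces
        if ($fileInfo->isDot() || strpos($fileInfo->getFilename(), 'output_prefix') !== 0) {
            continue;
        }
        $fh = fopen($fileInfo->getPathname(), 'r');
        while (($line = fgets($fh)) !== false) {
            //use $line
        }
        fclose($fh);
    }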

Regards

Kenan Duman