
I have a CSV file with 5 million rows. I am using the CSV library from thephpleague.com. It uses generators, so it reads the file line by line; reading alone does not cause any problem.

Yet I want to save those rows to the DB through a CI model, because I need to make sure data integrity is preserved: I want the model's validators, beforeInsert callbacks, etc. to run. So the question is not how to insert data into the DB, but how to insert data while processing it through its Model.

Here is my code:

namespace App\Util;

use App\Models\TestModel;
use League\Csv\Reader;

class CSV
{
    public function readData(string $filePath, array $headers = [])
    {
      
        $filePath = ROOTPATH.'test.5mn.csv';

        $maxExecutionTime = ini_get("max_execution_time");
        ini_set('max_execution_time', 300);

        $reader = Reader::createFromPath($filePath, 'r');
        $reader->setHeaderOffset(0); //set the CSV header offset
        $reader->skipEmptyRecords();

        if (true === empty($headers)) {
            $headers = $reader->getHeader();
        }

        $records = $reader->getRecords($headers);

        $isGoodToGo = true;
    
        foreach ($records as $record) {
            if ($this->isEmptyWithNullValues($record)) {
                continue;
            }

            if ($this->isEmptyWithBlankString($record)) {
                continue;
            }

            $model = model(TestModel::class);
            $model->save([
                'item_type' => $record["Item Type"],
                'sales_channel' => $record["Sales Channel"],
                'total_profit' =>  $record["Total Profit"],
            ]);
            log_message('error', '#'.$model->getInsertID() . ' saved');
            $model = null;
        }

        ini_set('max_execution_time', $maxExecutionTime);

        return $isGoodToGo;
    }
    // ...
} 

In the controller I am calling this:

public function index()
{
    $test = new CSV();
    $test->readData('path', []);
    dump('EXIT');
} 

But it throws a PHP memory error after a certain number of records. In the dev environment it saves only 123456 records before throwing the memory error; in production it saves 253321 records and then throws the memory error.

There is some overhead somewhere, I presume, but I am not an expert. Is there a method, on the model, the connection, or anywhere else, with which I can flush the memory? I suspect CI is keeping some data around even in the production environment. Any idea would be very helpful. Thank you.

For reference, the error message:

CRITICAL - 2022-01-06 11:48:09 --> Allowed memory size of 134217728 bytes exhausted (tried to allocate 67108872 bytes)
#0 [internal function]: CodeIgniter\Debug\Exceptions->shutdownHandler()
#1 {main}

  • Try creating the `$model` outside the loop and of course removing the `$model = null;` from the loop also – RiggsFolly Jan 06 '22 at 11:18
  • @RiggsFolly Creating `$model` outside and removing `$model = null` do not make a difference. – YahyaE Jan 06 '22 at 11:54
  • Does this answer your question? [Allowed memory size of 33554432 bytes exhausted (tried to allocate 43148176 bytes) in php](https://stackoverflow.com/questions/415801/allowed-memory-size-of-33554432-bytes-exhausted-tried-to-allocate-43148176-byte) – steven7mwesigwa Jan 07 '22 at 05:40
  • Does this answer your question? [Fatal Error: Allowed Memory Size of 134217728 Bytes Exhausted (CodeIgniter + XML-RPC)](https://stackoverflow.com/questions/561066/fatal-error-allowed-memory-size-of-134217728-bytes-exhausted-codeigniter-xml) – steven7mwesigwa Jan 07 '22 at 05:42
  • Does this answer your question? [Advice on returning results from a large query returning error- Allowed memory size of 734003200 bytes exhausted](https://stackoverflow.com/questions/71515456/advice-on-returning-results-from-a-large-query-returning-error-allowed-memory-s/71516397#71516397) – steven7mwesigwa Apr 16 '22 at 15:08

2 Answers


How big is your CSV file in MB?

The reason I ask: if your memory limit is 128 MB and your file is 100 MB (and judging from your filename you have some millions of records in there), you have already consumed most of your memory just by loading the CSV file.

I believe there isn't much you can do in the code except set the memory limit to -1 and cross your fingers. It may work, or it may crash again.
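
For completeness, a minimal sketch of that stopgap, assuming you only want to lift the limit for this one request rather than globally in php.ini; it does not address the underlying memory growth:

// Stopgap, not a fix: lift the PHP memory limit for this request only.
// '-1' means unlimited as far as PHP is concerned; the machine can still run out of RAM.
$previousLimit = ini_get('memory_limit');
ini_set('memory_limit', '-1');

// ... run the import ...

ini_set('memory_limit', $previousLimit); // restore the original limit afterwards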

My suggestion, and what I would do:

That many records should not be inserted into the DB in one go anyway; doing the job in batches is the best practice. When you have 30 million or 100 million records, batching will still work, but this single-pass approach will not.
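
For illustration, a rough sketch of what batching could look like inside the existing loop, assuming CodeIgniter 4's `insertBatch()` and the column names from the question. Note that, depending on the CI version, `insertBatch()` may not trigger the same per-row `beforeInsert` callbacks as `save()`, so check the user guide before relying on it:

// Sketch only: buffer rows and flush them in chunks so memory stays bounded.
$model     = model(TestModel::class);
$batch     = [];
$batchSize = 1000; // tune to your row size and memory limit

foreach ($records as $record) {
    $batch[] = [
        'item_type'     => $record['Item Type'],
        'sales_channel' => $record['Sales Channel'],
        'total_profit'  => $record['Total Profit'],
    ];

    if (count($batch) >= $batchSize) {
        $model->insertBatch($batch);
        $batch = [];
        gc_collect_cycles(); // encourage PHP to release freed memory between chunks
    }
}

if ($batch !== []) {
    $model->insertBatch($batch); // flush the final partial chunk
}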

Ergec
  • File is in GB. Yet, "It uses generators so reads the file line by line. So reading does not cause any problem." – YahyaE Jan 07 '22 at 06:29

I got an answer in the CI forum. The CI Project Lead mentioned:

"I have had to do this in the past. The way I handled it there was to use the LOAD DATA command to load the CSV into a new, temporary table. Then would loop over the temp table doing validation and normalization of the data before inserting it into the correct table(s). And, yes, chunk() is a perfect use case for that once you have it in the temp table."

Posting it here for future reference. Thanks.
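
For illustration, a rough sketch of that approach, assuming MySQL, a hypothetical staging table named `import_tmp` whose columns match the CSV, and that `LOCAL INFILE` is enabled on both server and client. The second pass is shown with a plain LIMIT/OFFSET loop rather than any particular `chunk()` helper:

// 1) Bulk-load the raw CSV into the staging table; MySQL does this without PHP memory.
$db  = db_connect();
$sql = <<<'SQL'
LOAD DATA LOCAL INFILE ?
INTO TABLE import_tmp
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(item_type, sales_channel, total_profit)
SQL;
$db->query($sql, [ROOTPATH . 'test.5mn.csv']);

// 2) Walk the staging table in chunks, pushing each row through the Model
//    so its validators and beforeInsert callbacks still run.
$model     = model(TestModel::class);
$chunkSize = 1000;
$offset    = 0;

do {
    $rows = $db->table('import_tmp')
               ->limit($chunkSize, $offset)
               ->get()
               ->getResultArray();

    foreach ($rows as $row) {
        $model->save($row); // validation, normalization, callbacks per row
    }

    $offset += $chunkSize;
} while ($rows !== []);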

YahyaE