3

I am writing an artisan console command that loops through all the records in a table and regenerates a field on that table.

The field is a hash and is generated as an md5() of a particular string.

Initially my code looked like this:

// Get all recipes
$recipes = Recipe::all();

$hashProgress = $this->output->createProgressBar(count($recipes));

// Loop over each recipe and generate a new hash for it
foreach ($recipes as $recipe)
{
    $hashString = '';

    $hashString .= $recipe->field1;
    $hashString .= $recipe->field2;
    $hashString .= $recipe->field3;
    $hashString .= $recipe->field4;
    $hashString .= $recipe->field5;
    $hashString .= $recipe->field6;
    $hashString .= $recipe->field7;

    $extras1Total = $recipe->extras1->sum('amount');
    $hashString .= $recipe->extras1->reduce(function ($str, $item) use ($extras1Total) {
        return $str . $item->name . ($extras1Total == 0 ? $item->amount : ($item->amount / $extras1Total * 100));
    }, '');

    $extras2Total = $recipe->extras2->sum('amount');
    $hashString .= $recipe->extras2->reduce(function ($str, $item) use ($extras2Total) {
        return $str . $item->name . ($extras2Total == 0 ? $item->amount : ($item->amount / $extras2Total * 100));
    }, '');

    $extras3Total = $recipe->extras3->sum('amount');
    $hashString .= $recipe->extras3->reduce(function ($str, $item) use ($extras3Total) {
        return $str . $item->name . ($extras3Total == 0 ? $item->amount : ($item->amount / $extras3Total * 100));
    }, '');

    $extras4Total = $recipe->extras4->sum('amount');
    $hashString .= $recipe->extras4->reduce(function ($str, $item) use ($extras4Total) {
        return $str . $item->name . ($extras4Total == 0 ? $item->amount : ($item->amount / $extras4Total * 100));
    }, '');

    $recipe->update([
        'hash' => md5($hashString),
    ]);

    $hashProgress->advance();
}

$hashProgress->finish();
$this->info(' Recipe hashes regenerated.');

After getting to around 10,000 of 28,000 records it would die with a memory exhausted error:

PHP Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 4096 bytes)

I thought chunking this might help:

// Get all recipes
$recipes = Recipe::all();

$hashProgress = $this->output->createProgressBar(count($recipes));

// Loop over each recipe and generate a new hash for it
foreach ($recipes->chunk(1000) as $chunk)
{
    foreach ($chunk as $recipe)
    {
        $hashString = '';

        $hashString .= $recipe->field1;
        $hashString .= $recipe->field2;
        $hashString .= $recipe->field3;
        $hashString .= $recipe->field4;
        $hashString .= $recipe->field5;
        $hashString .= $recipe->field6;
        $hashString .= $recipe->field7;

        $extras1Total = $recipe->extras1->sum('amount');
        $hashString .= $recipe->extras1->reduce(function ($str, $item) use ($extras1Total) {
            return $str . $item->name . ($extras1Total == 0 ? $item->amount : ($item->amount / $extras1Total * 100));
        }, '');

        $extras2Total = $recipe->extras2->sum('amount');
        $hashString .= $recipe->extras2->reduce(function ($str, $item) use ($extras2Total) {
            return $str . $item->name . ($extras2Total == 0 ? $item->amount : ($item->amount / $extras2Total * 100));
        }, '');

        $extras3Total = $recipe->extras3->sum('amount');
        $hashString .= $recipe->extras3->reduce(function ($str, $item) use ($extras3Total) {
            return $str . $item->name . ($extras3Total == 0 ? $item->amount : ($item->amount / $extras3Total * 100));
        }, '');

        $extras4Total = $recipe->extras4->sum('amount');
        $hashString .= $recipe->extras4->reduce(function ($str, $item) use ($extras4Total) {
            return $str . $item->name . ($extras4Total == 0 ? $item->amount : ($item->amount / $extras4Total * 100));
        }, '');

        $recipe->update([
            'hash' => md5($hashString),
        ]);

        $hashProgress->advance();
    }
}

$hashProgress->finish();
$this->info(' Recipe hashes regenerated.');

But I am still getting a memory exhaustion error.

How can I loop through all these records and achieve what I am after without increasing the memory limit?

James
  • 15,754
  • 12
  • 73
  • 91
  • You are referring to a bunch of functions under recipe, I suspect you have a memory leak in there, look for public variables. – Forbs Nov 14 '17 at 23:30

1 Answers1

8

The way you're "chunking" is actually consuming more memory than the initial code.

What you're doing is getting all the records at once, storing them in $recipes and then chunking the results by calling the chunk() on the resulted collection.

Instead, you need to call the method with the same name, chunk(), on the underlying Recipe model's query builder and generate hashes chunk by chunk:

Recipe::chunk(1000, function ($recipies) {
    // Hash generation logic here
});

This way, you eliminate having a huge $recipes variable which I'm sure is the bottleneck here. Depending on the available memory, you might need to tweak the chunk size a bit to avoid memory exhaustion.

Also, I'd try to use fewer variables when generating the hash instead of leaving a trail of $extras1Total, extras2Total, ... variables. All of them can be replaced with a $total that will be rewritten over and over. This is micro-optimization though.

P.S. In case of significant database write stress (which is rare with 28k total), you might want considering to do the final updates in one (or few) go(es) instead of doing it per record.

sepehr
  • 17,110
  • 7
  • 81
  • 119
  • Perfect, I've made the changes as you suggest; by using `chunk()` straight on the model and then have also replaced the individual totals with one `$total` that just gets reused. With regards to doing mass updates as opposed to doing it per record. How would I go about this, just set the hash value and mass update the collection at the end? – James Nov 15 '17 at 00:02
  • 1
    I suggest doing it only if needed. [YAGNI](https://martinfowler.com/bliki/Yagni.html) may apply. Doing [bulk inserts](https://stackoverflow.com/questions/12702812/bulk-insertion-in-laravel-using-eloquent-orm) is easy. However, to do bulk updates you need to change your code a bit in order to utilize [transactions](https://laravel.com/docs/5.5/database#database-transactions). Read more about it [here](https://stackoverflow.com/a/45909598/65732). Also, have a look at this [little awesome package](https://github.com/cossay/bulk-update-query-writer) to do it in just one big query. – sepehr Nov 15 '17 at 00:30