
I have this code in my Laravel app: I read every line in a .csv file and want to update a value. But the many individual update queries are very slow with a .csv of 8k rows. How can I speed up this code? Thanks

DB::beginTransaction();

try {
    $delimiter = ",";
    $firstLine = true;

    if ($handle !== false) {
        $position = 1;
        while (($csv_line = fgetcsv($handle, 1000, $delimiter)) !== false) {
            if ($firstLine == true) {
                $firstLine = false;
                continue;
            }

            $player_uid = $csv_line[0];

            DB::table('scores')
                ->where('season_uid', $season_uid)
                ->where('day', $day)
                ->where('player_uid', $player_uid)
                ->update(['position' => $position]);

            $position++;
        }
        fclose($handle);
    }

    DB::commit();
    return true;
} catch (\Exception $e) {
    Log::error($e);
    DB::rollBack();
    return false;
}
Gigignu
    I'm voting to close this question as off-topic because it belongs to CodeReview – VLAZ Jun 11 '19 at 11:26
  • .. but my educated guess would be that the scores table does not have a correct index.. `SHOW CREATE TABLE scores` – Raymond Nijland Jun 11 '19 at 11:36
  • From looking at the code I would bet that the performance bottleneck lies in the execution of `SELECT`s which are probably generated by calling `DB::table()->where()->...` Using [Xdebug profiler](https://xdebug.org/docs/profiler) to profile this should definitely shed more light on it. – Dejv Jun 11 '19 at 11:37
  • How much is "very slow"? And how fast would you like it to be? – Paul Spiegel Jun 11 '19 at 11:55

2 Answers


MySQL doesn't support bundling multiple `UPDATE ... WHERE` statements into one query, but there is a neat trick to replace updates with inserts using the `ON DUPLICATE KEY UPDATE` clause. That way you can actually update your records in bulk. Check out this answer for some examples.

To my knowledge, though, Laravel doesn't support this clause in its query builder, so you will have to generate the query manually and issue it via `DB::statement()`. Make sure to chunk your incoming rows (by 100, for example) and you will see a noticeable boost in speed.
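A minimal sketch of that approach (untested; it assumes `$rows` is a `player_uid => position` map parsed from the CSV, and that `scores` has a unique key covering `season_uid`, `day`, `player_uid` so `ON DUPLICATE KEY UPDATE` matches the existing rows):

```php
// Assumption: $rows maps player_uid => position, built from the CSV.
// Assumption: a unique key on (season_uid, day, player_uid) exists,
// otherwise this would insert new rows instead of updating.
foreach (array_chunk($rows, 100, true) as $chunk) {
    $placeholders = [];
    $bindings = [];

    foreach ($chunk as $playerUid => $position) {
        $placeholders[] = '(?, ?, ?, ?)';
        array_push($bindings, $season_uid, $day, $playerUid, $position);
    }

    // One multi-row INSERT per chunk; duplicates fall through to the
    // UPDATE part, which overwrites only the position column.
    DB::statement(
        'INSERT INTO scores (season_uid, day, player_uid, position) VALUES '
        . implode(', ', $placeholders)
        . ' ON DUPLICATE KEY UPDATE position = VALUES(position)',
        $bindings
    );
}
```

This turns 8k single-row updates into ~80 round trips, which is where most of the speedup comes from.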

But do realize that updating 8k rows is not a cheap operation. The best practice would be to delegate it to a separate job and set up queues in your application so that those updates can be processed in the background by your workers separately. You can read more about jobs and queues in the official documentation.
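A rough sketch of that queue setup (the class and property names are illustrative, not from the question):

```php
// app/Jobs/UpdateScorePositions.php — illustrative job class.
class UpdateScorePositions implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    private $path;
    private $seasonUid;
    private $day;

    public function __construct($path, $seasonUid, $day)
    {
        $this->path = $path;
        $this->seasonUid = $seasonUid;
        $this->day = $day;
    }

    public function handle()
    {
        // Parse the CSV at $this->path and run the bulk update here;
        // a queue worker executes this outside the web request.
    }
}

// In the controller: dispatch and return immediately.
UpdateScorePositions::dispatch($csvPath, $season_uid, $day);
```

The request then returns right away while a worker (`php artisan queue:work`) processes the update in the background.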

d3jn
  • *"MySQL doesn't support mass updates"* I never encountered that, as MySQL can deal with that fine in a transaction, with indexed columns used for the WHERE clause in the update statement to prevent scanning the table all the time. – Raymond Nijland Jun 11 '19 at 11:40
  • @RaymondNijland my bad, should've used a better term there - what I mean is that you can't bulk multiple `UPDATE ... WHERE` queries together as a single query. In theory it could still have benefited from all queries having the same WHERE clauses and utilized indices as a result, speeding up the update without the overhead of splitting it into separate queries. – d3jn Jun 11 '19 at 11:51
  • *"what I mean is that you can't bulk multiple UPDATE ... WHERE queries together as a single query"* Well, in fact you can: if all updates run on the same table, then you can use a `CASE ... END` expression – Raymond Nijland Jun 11 '19 at 11:54
  • @RaymondNijland thanks for the insight, didn't know about such constructs. I've updated my answer so it won't confuse anyone. – d3jn Jun 11 '19 at 11:57
  • *"thanks for the insight, didn't know about such constructs"* no problem, take a look at [this](https://www.db-fiddle.com/f/iWkuX4LPMsiFfU24eFruQP/0) for an example – Raymond Nijland Jun 11 '19 at 12:05

Using separate jobs would indeed be the recommended way, but you could give the following code a try. The idea of building a single update query comes from https://github.com/laravel/ideas/issues/575 , where the author reported a ~13x reduction in loading time.

Please note that this code hasn't been tested.

DB::beginTransaction();

try {
    $csv = array_map('str_getcsv', file('data.csv'));

    // remove the first line
    array_shift($csv);

    // grab only the players uids and their positions
    $positions = array_flip(array_column($csv, 0));

    array_walk($positions, static function (&$position, $id) {
        // + 1 because positions start at 1 in the original code
        $position = 'WHEN ' . $id . ' THEN ' . ($position + 1);
    });

    // NOTE: the player uids are interpolated directly into the SQL,
    // so this is only safe if they are numeric / sanitized.
    DB::update("UPDATE `scores` 
                SET `position` = CASE `player_uid` " . implode(' ', $positions) . " END 
                WHERE `player_uid` in (" . implode(',', array_keys($positions)) . ") 
                  AND `season_uid` = ? 
                  AND `day` = ?", [$season_uid, $day]);

    DB::commit();

    return true;

} catch (\Exception $e) {
    Log::error($e);
    DB::rollBack();

    return false;
}
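If the player uids can't be guaranteed to be numeric, a variant of the same `CASE` query with every value bound as a parameter would avoid interpolating them into the SQL (a sketch, untested like the above; `$csv` is the parsed CSV with the header already removed):

```php
// Build "WHEN ? THEN ?" pairs and bind uids/positions as parameters
// instead of interpolating them into the query string.
$whens = [];
$bindings = [];
$position = 1;

foreach (array_column($csv, 0) as $playerUid) {
    $whens[] = 'WHEN ? THEN ?';
    array_push($bindings, $playerUid, $position++);
}

// Placeholders for the IN (...) list, one per uid.
$uidPlaceholders = implode(',', array_fill(0, count($whens), '?'));
$uidBindings = array_column($csv, 0);

// Binding order must match placeholder order:
// CASE pairs first, then the IN list, then season_uid and day.
DB::update(
    'UPDATE `scores`
     SET `position` = CASE `player_uid` ' . implode(' ', $whens) . ' END
     WHERE `player_uid` IN (' . $uidPlaceholders . ')
       AND `season_uid` = ? AND `day` = ?',
    array_merge($bindings, $uidBindings, [$season_uid, $day])
);
```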

PS: It would be nice if you could leave a comment about the performance change with this approach.

Mihai Matei