
I have a huge number of records I want to upload into a MySQL database in my Laravel app. To avoid uploading duplicates, I check whether a model with the same name already exists in the database. I'm doing this check with Eloquent, but since I'll be handling a large amount of data, I was wondering if there is a more performant way to do this; my goal is to perform as few DB operations as possible!

My code:

public function uploadIntoDatabase(Request $request)
    {
        $venuesToUpload = $request['venuesToUpload'];

        $allVenues = Venue::all(); // loads every venue into memory but is never used

        foreach($venuesToUpload as $index => $venue)
        {
            // one SELECT per incoming venue
            $alreadyAdded = Venue::where('name', $venue['name'])->first();

            if(!$alreadyAdded)
            {
                $newVenue = new Venue();
                $newVenue->name = $venue['name'];
                $newVenue->save();
            }
        }

        return response()->json([
            'message' => 'All venues uploaded',
        ]);
    }
  • Don't use a SELECT as a means to determine if a subsequent INSERT will create a duplicate. Instead just create a unique constraint on the field(s) that determine uniqueness and then check the result of the insert command for failures. – Alex Howansky Aug 31 '20 at 16:25
  • Please check my answer [here](https://stackoverflow.com/questions/63661078/laravel-how-to-create-unique-tag-in-tags-model/63661192#63661192); `updateOrCreate` will save time. – Arman Aug 31 '20 at 16:29
  • `updateOrCreate()` performs a SELECT first, it's pretty much the same thing OP is already doing. – Alex Howansky Aug 31 '20 at 16:39
  • Well, then the question is related to query optimization. OP can use a raw MySQL query, read [here](https://stackoverflow.com/questions/548541/insert-ignore-vs-insert-on-duplicate-key-update) for clues, or `perform as few DB operations as possible` with two queries: 1. get all the duplicate data and remove it from the insert set, 2. use one bulk insert for the unique data. – Arman Aug 31 '20 at 17:45
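A rough sketch of the unique-constraint approach from the comments above, assuming a `venues` table and Laravel 6+ (where `insertOrIgnore()` is available); note the migration will fail if the table already contains duplicate names:

use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

// Migration: let the database itself enforce uniqueness.
// (Existing duplicates must be cleaned up before this will run.)
Schema::table('venues', function (Blueprint $table) {
    $table->unique('name');
});

// Controller: a single statement; on MySQL this compiles to
// INSERT IGNORE, which silently skips rows violating the index.
$rows = [];
foreach ($request['venuesToUpload'] as $venue) {
    $rows[] = ['name' => $venue['name']];
}
Venue::insertOrIgnore($rows);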

2 Answers


In order to simplify your method a bit, you could use the `firstOrCreate` Eloquent method: https://laravel.com/docs/7.x/eloquent#other-creation-methods

public function uploadIntoDatabase(Request $request)
    {
        $venuesToUpload = $request['venuesToUpload'];

        foreach($venuesToUpload as $venue)
        {
            // firstOrCreate() runs a SELECT and, if no match is found, an INSERT.
            // 'name' must be listed in $fillable on the Venue model.
            Venue::firstOrCreate(['name' => $venue['name']]);
        }

        return response()->json([
            'message' => 'All venues uploaded',
        ]);
    }

You could also approach this via the raw SQL query builder using transactions, but you would have to build the query beforehand, which would make for more complex code.

References: https://mariadb.com/kb/en/transactions/ ; https://laravel.com/docs/7.x/database#database-transactions

A guideline you may follow:

  • Iterate over all venues and collect their names into an array, OR prepare a SQL statement while iterating, so that you can fetch all records from the database whose name matches.
  • Remove those already-existing records from the incoming $venues array.
  • Start a transaction.
  • Run all the insertions.
  • Commit the transaction (see the sketch below).
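A minimal sketch of that guideline using the query builder, assuming a `venues` table with a `name` column (variable names are illustrative):

use Illuminate\Support\Facades\DB;

$names = array_column($venuesToUpload, 'name');

// 1. One SELECT: find which incoming names already exist.
$existing = DB::table('venues')->whereIn('name', $names)->pluck('name')->all();

// 2. Drop the duplicates from the incoming set.
$fresh = array_diff($names, $existing);

// 3.-5. Run all the insertions inside a single transaction.
DB::transaction(function () use ($fresh) {
    $rows = [];
    foreach ($fresh as $name) {
        $rows[] = ['name' => $name];
    }
    DB::table('venues')->insert($rows);
});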

Last but not least, you could set up a queue and insert the data in chunks with the `firstOrCreate` method while going through each of the records. This would not keep the data up to date at all times and would require a cache/queue store like Redis (a rough sketch follows the reference below).

Reference: https://laravel.com/docs/7.x/queues
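As an illustration of the queue idea (the `ImportVenueChunk` job and the chunk size of 500 are made up for this example; it assumes the default `App\Venue` model location and a configured queue driver such as Redis):

use App\Venue;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

class ImportVenueChunk implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public $names;

    public function __construct(array $names)
    {
        $this->names = $names;
    }

    public function handle()
    {
        foreach ($this->names as $name) {
            // Still one SELECT (plus INSERT) per name, but off the request cycle.
            Venue::firstOrCreate(['name' => $name]);
        }
    }
}

// In the controller: dispatch the upload in chunks of 500 names.
foreach (array_chunk(array_column($venuesToUpload, 'name'), 500) as $chunk) {
    ImportVenueChunk::dispatch($chunk);
}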

  • `firstOrCreate()` performs a SELECT prior to each INSERT, will not reduce the number of queries required, and will not guarantee uniqueness. – Alex Howansky Aug 31 '20 at 17:26
  • Additionally, you can't really rely on something like `name` for uniqueness... But that goes above and beyond the problem a `unique` constraint at the database level would handle. – Tim Lewis Aug 31 '20 at 17:37
  • How about using validation? Maybe uniquely identify the venues to upload then validate their uniqueness against the data in your database table – F KIng Aug 31 '20 at 18:23
  • @AlexHowansky - Just to clarify - I suggested using the `firstOrCreate()` method as it would make the method cleaner, not claiming it would fix the delay. – Daniel Yonkov Sep 01 '20 at 09:43
  • @TimLewis - True, but the colleague here is using it as a pivot point to decide whether the records are unique (see his code above). I agree that an index could fix the issue, but we should not take for granted that he has access to add to or modify the structure of the database in any way. – Daniel Yonkov Sep 01 '20 at 09:45

In order to

perform as few DB operations as possible

you can do the following:

  1. Fetch all the data that already exists, in one query.
  2. Prepare the unique data for a bulk insert.

Code:

public function uploadIntoDatabase(Request $request)
{
    // Collect the incoming names as a plain array.
    $venuesToUpload = array_column($request['venuesToUpload'], 'name');

    // 1. One SELECT fetches every name that already exists.
    $dupes = Venue::whereIn('name', $venuesToUpload)->pluck('name')->toArray();
    $diff = array_diff($venuesToUpload, $dupes);

    // 2. One bulk INSERT for the unique names. insert() goes through the
    // query builder, so $fillable is not consulted and the timestamp
    // columns are not filled automatically.
    $prepInsert = [];
    foreach ($diff as $val) {
        $prepInsert[] = ['name' => $val];
    }
    Venue::insert($prepInsert);

    return response()->json([
        'message' => 'All venues uploaded',
    ]);
}

That is faster than what you are doing now, but if it's not what you want, you can use the raw MySQL option I offered in the comment on your question (see the sketch below).
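For reference, a sketch of that raw option (assuming the unique index on `name` discussed in the comments, so that MySQL can reject duplicates itself):

use Illuminate\Support\Facades\DB;

// INSERT IGNORE silently skips rows that would violate the unique
// index, so duplicates are handled by MySQL in a single round trip.
if (!empty($venuesToUpload)) {
    $placeholders = implode(', ', array_fill(0, count($venuesToUpload), '(?)'));
    DB::insert("INSERT IGNORE INTO venues (name) VALUES {$placeholders}", $venuesToUpload);
}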
