I'm scraping some data from Instagram and then creating a record of it in the database, but sometimes the record is inserted twice (or more). When I add an `echo` to the code, it prints only once, so I'm not sure why the record is sometimes inserted multiple times. Here's the code:

public function handle()
{

    #Define variables
    $website_id = null;
    $user_id = null;
    $avg_dataset_start = null;
    $avg_dataset_end = null;

    try {
        $api = new \Instagram\Api();
        $api->setUserName($this->username);
        $obj = $api->getFeed();

        #quick hack to convert to nested obj
        $data = json_decode(json_encode($obj));

        $data->videos_count = 0;
        $data->pictures_count = 0;
        $data->avgPostLikes = 0;
        $data->avgPostComments = 0;
    } catch (Exception $e) {
        return $e;
    }

    $account = InstagramAccount::where('instagram_id', $data->id)->orWhere('username', $data->userName)->first();

    $scraped_data = InstagramAccountScrape::create([
        'instagram_account_id' => $account->id,
        'username' => $data->userName,
        'full_name' => $data->fullName,
        'biography' => $data->biography,
        'profile_picture_url' => $data->profilePicture,
        'external_url' => $data->externalUrl,
        'website_id' => $website_id,
        'media_count' => $data->mediaCount,
        'followers_count' => $data->followers,
        'following_count' => $data->following,
        'avg_likes_count' => $data->avgPostLikes,
        'avg_comments_count' => $data->avgPostComments,
        'avg_dataset_start' => $avg_dataset_start,
        'avg_dataset_end' => $avg_dataset_end,
        'avg_dataset_photos_count' => $data->pictures_count,
        'avg_dataset_videos_count' => $data->videos_count,
        'is_private' => $data->private,
        'is_verified' => $data->verified,
        'user_id' => $user_id,
    ]); #Sometimes adds more than one record

    echo "Print once please"; #prints once, as expected

    return response()->json($scraped_data); #returns only one instance
}
Michał
  • `create` only creates one entry in the database. You are probably calling the `handle` method multiple times, most likely through multiple requests, which is why you only see one `echo`. How are you triggering this method? – Remul Sep 20 '19 at 13:34
  • @Remul thanks for the reply! Like so from the controller: `return ScrapeInstagramAccount::dispatchNow($username)`. The bizarre thing is that if I move the entire code out of a job and into a controller, it still runs multiple times, *sometimes*... – Michał Sep 20 '19 at 13:37
  • @Remul I have added another arbitrary `create` before (for a different model) and it also sometimes runs multiple times, the same amount of times as the one in question... I'm not sure where to go debugging from here. I was thinking it could have something to do with the execution time? – Michał Sep 20 '19 at 13:44
  • To verify that it runs multiple times, you could add `\Log::info("run for account: {$account->id}");` instead of the echo and then check your log file. – Remul Sep 20 '19 at 13:47
  • @Remul so telescope is showing me one request. Though interestingly when I change the username, on the first scrape it logs 2 or more requests, and on subsequent scrapes of the same account it logs 1 request. – Michał Sep 20 '19 at 13:51
  • How long does it take for the handle method to execute? It might be a problem with your `retry_after` value in `config/queue.php`, you could try increasing it. [Docs](https://laravel.com/docs/5.8/queues#job-expirations-and-timeouts) – Remul Sep 20 '19 at 13:57
  • @Remul hmm, but my queue driver is `sync`, and the issue also happens if I move the job code directly into the controller, bypassing the queue altogether. Anyway, I increased it 10x, but to no avail. – Michał Sep 20 '19 at 14:00

1 Answer

Figured it out, and it was very strange indeed.

It was caused by a `style="background: url('')"` attribute where the URL was empty. The browser resolved the empty URL against the current page, so it requested the same page again, sometimes several times, until it decided to stop the reload madness.
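One way to spot this kind of stray request is to log what each incoming request asks for. The sketch below is only an illustration: `describeRequest` is a hypothetical helper, and it assumes that a browser fetching a background image sends an `Accept` header without `text/html`, which is typical but not guaranteed.

```php
<?php
// Hypothetical helper: summarise a request from a $_SERVER-style array.
function describeRequest(array $server): string
{
    $method = $server['REQUEST_METHOD'] ?? 'UNKNOWN';
    $uri    = $server['REQUEST_URI'] ?? '';
    $accept = $server['HTTP_ACCEPT'] ?? '';

    // Assumption: a normal page load asks for text/html, while a request
    // triggered by a CSS background or <img> asks for an image type.
    $kind = (stripos($accept, 'text/html') !== false) ? 'page' : 'subresource';

    return sprintf('%s %s [%s] accept=%s', $method, $uri, $kind, $accept);
}
```

Calling `error_log(describeRequest($_SERVER));` at the start of the request would then show whether the extra hits on the route are tagged `subresource`, which points at markup rather than at the job or controller.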

It happened on every page because the attribute was in the layout Blade template.
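A minimal sketch of one way to guard against this, assuming the background URL comes from a variable that may be empty: emit the `style` attribute only when there is something to point at. `backgroundStyle` is a hypothetical helper, not part of Laravel or of the original template.

```php
<?php
// Hypothetical helper: build the style attribute only for a non-empty URL,
// so the browser never sees url('') and never re-requests the current page.
function backgroundStyle(?string $url): string
{
    $url = trim((string) $url);
    if ($url === '') {
        return ''; // no attribute at all when there is no image
    }

    // HTML-escape the URL for use inside the attribute value.
    // This sketch does not add CSS-level escaping.
    $safe = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');

    return sprintf('style="background: url(\'%s\')"', $safe);
}
```

In a Blade layout this could be used as `<div {!! backgroundStyle($url) !!}>`, where `$url` stands for whatever variable holds the image path.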

Special thanks to the thread here: MVC controller is being called twice

Michał