
I have a Laravel command that periodically uploads files to a remote S3 disk (DigitalOcean Spaces, which is S3 compatible and uses the Laravel/Flysystem S3 driver).

The problem is that the command crashes after moving around 100 files because the server runs out of resources/file pointers. When I dump the open resources, I can see that the count increases by about 5 for each file/iteration, until the server runs out of available file pointers and PHP can't open any more resources.

Setting a high limit with ulimit -n 999999 helps, but that doesn't really solve the problem.

<?php

$localDisk     = Storage::disk('some_local_disk');
$localFiles    = $localDisk->allFiles();
$localBaseDir  = $localDisk->path('');
$remoteDisk    = Storage::disk('some_remote_disk'); // DigitalOcean Spaces (S3 compatible)
$remoteBaseDir = 'some/folder';


// Check resources before we start
dump(get_resources('stream'), count(get_resources('stream')));

foreach ($localFiles as $file) {
    // Skip dotfiles
    if (Str::startsWith($file, '.')) {
        continue;
    }

    $resourcesOpened = count(get_resources('stream'));

    $localLocation = $localBaseDir.$file;
    $remoteLocation = $remoteBaseDir.$file;

    $fileHandle = fopen($localLocation, 'ab+');
    $remoteDisk->put($remoteLocation, $fileHandle); // Guess this is creating resources, but why aren't they closed? How can I close them?

    $localDisk->delete($file); // Thought this would be enough
    fclose($fileHandle);


    // Check resources after each iteration
    dump('___________________', get_resources('stream'), count(get_resources('stream')));

    usleep(5000); // Pause a little to reduce the load
}

How can I close the resources that are opened? Is there a better way to do this? I was thinking about using an S3 CLI client, but I would like to avoid that.

Pelmered

1 Answer


Have you tried using Laravel Storage's automatic streaming?

With LazyCollection - Chunk

//use Illuminate\Http\File;
//use Illuminate\Support\Str;
//use Illuminate\Support\LazyCollection;
//use Illuminate\Support\Facades\Storage;

$localDisk = Storage::disk('some_local_disk');
$remoteDisk = Storage::disk('some_remote_disk'); // DigitalOcean Spaces (S3 compatible)
$remoteBaseDir = 'some/folder';

LazyCollection::make($localDisk->allFiles())
    /** Remove dotfiles */
    ->reject(fn($filePath) => Str::startsWith($filePath, '.'))
    /** Split into chunks of 50 files */
    ->chunk(50)
    ->each(fn($chunk) => $chunk->each(function($filePath) use($localDisk, $remoteDisk, $remoteBaseDir){
        
        $file = new File($localDisk->path($filePath));

        /** putFile stores the file under a unique, hashed filename, suffixed with its extension */
        $remoteDisk->putFile(
            /** path to store file */
            $remoteBaseDir, 
            /** File to upload */
            $file,
            /** set visibility - optional */
            'public'
        );

        /** ====== OR ====== */

        /** To store the file with its original name instead of the Laravel-generated id (hash),
         * use the putFileAs method
         */
        $remoteDisk->putFileAs(
            /** path to store the file */
            $remoteBaseDir,
            /** File to upload (stream) */
            $file,
            /** Store the file on remote disk with this filename */
            $file->getFilename(),
            /** Set visibility - optional */
            'public'
        );
    }));
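
As an aside (not part of this answer): if you would rather keep the manual stream handling from the question, here is a minimal sketch, assuming the same disk names as the question, that reads each file with readStream() and closes the handle explicitly after writeStream().

<?php

use Illuminate\Support\Str;
use Illuminate\Support\Facades\Storage;

$localDisk     = Storage::disk('some_local_disk');
$remoteDisk    = Storage::disk('some_remote_disk'); // DigitalOcean Spaces (S3 compatible)
$remoteBaseDir = 'some/folder';

foreach ($localDisk->allFiles() as $file) {
    // Skip dotfiles
    if (Str::startsWith($file, '.')) {
        continue;
    }

    // readStream() returns a plain PHP stream resource for the local file
    $stream = $localDisk->readStream($file);

    if (! is_resource($stream)) {
        continue;
    }

    // writeStream() uploads from the resource without loading the whole file into memory
    $remoteDisk->writeStream($remoteBaseDir.'/'.$file, $stream);

    // The adapter is not guaranteed to close the handle, so close it if it is still open
    if (is_resource($stream)) {
        fclose($stream);
    }

    $localDisk->delete($file);
}
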
With Queued Job for each chunk of 50 files

If even that is not enough, you can dispatch a queued job that uploads 50 files at a time (one job per chunk of 50 files):

<?php

namespace App\Jobs;

use Illuminate\Http\File;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Illuminate\Contracts\Filesystem\Filesystem;
use Illuminate\Support\Collection;

class UploadFilesToSpaces implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    protected $filePaths;
    protected $localDisk;
    protected $remoteDisk;
    protected $remoteBaseDir;

    public function __construct(
        Collection $filePaths,
        Filesystem $localDisk,
        Filesystem $remoteDisk,
        string $remoteBaseDir
    )
    {
        $this->filePaths = $filePaths;
        $this->localDisk = $localDisk;
        $this->remoteDisk = $remoteDisk;
        $this->remoteBaseDir = $remoteBaseDir;
    }


    public function handle()
    {
        $this->filePaths->each(function ($filePath) {
            $file = new File($this->localDisk->path($filePath));

            /** putFile stores the file under a unique, hashed filename, suffixed with its extension */
            $this->remoteDisk->putFile(
                /** path to store file */
                $this->remoteBaseDir, 
                /** File to upload */
                $file,
                /** set visibility - optional */
                'public'
            );

            /** ====== OR ====== */

            /** To store the file with its original name instead of the Laravel-generated id (hash),
             * use the putFileAs method
             */
            $this->remoteDisk->putFileAs(
                /** path to store the file */
                $this->remoteBaseDir,
                /** File to upload (stream) */
                $file,
                /** Store the file on remote disk with this filename */
                $file->getFilename(),
                /** Set visibility - optional */
                'public'
            );
        });
    }      

}

And in the Command class:

//use Illuminate\Support\Str;
//use Illuminate\Support\LazyCollection;
//use Illuminate\Support\Facades\Storage;


$localDisk = Storage::disk('some_local_disk');
$remoteDisk = Storage::disk('some_remote_disk'); // DigitalOcean Spaces (S3 compatible)
$remoteBaseDir = 'some/folder';

LazyCollection::make($localDisk->allFiles())
    ->reject(fn($filePath) => Str::startsWith($filePath, '.'))
    ->chunk(50)
    /** collect() turns each lazy chunk into a plain Collection to match the job's constructor */
    ->each(fn($chunk) => UploadFilesToSpaces::dispatch($chunk->collect(), $localDisk, $remoteDisk, $remoteBaseDir));

Laravel Docs - Filesystem - Automatic Streaming

Donkarnash
  • Thanks! LazyCollection worked better. The resource count still ticks up each iteration, but it seems like garbage collection kicks in occasionally and clears the old streams. It looks promising so far. – Pelmered Jun 14 '22 at 08:24
  • @Pelmered You can try rate limiting the queue in combination with creating **UploadFilesToSpaces** jobs per chunk of 50 files. Queuing the jobs with rate limiting will reduce performance bottlenecks. https://laravel.com/docs/9.x/queues#rate-limiting – Donkarnash Jun 15 '22 at 00:44
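
For reference, a minimal sketch of the queue rate limiting suggested in the last comment, using the Redis::throttle pattern from the linked docs. The lock key ('spaces-uploads') and the 10-jobs-per-60-seconds limit are made-up examples, and this assumes a Redis connection is configured.

<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Illuminate\Support\Facades\Redis;

class UploadFilesToSpaces implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    // ... properties and constructor as in the answer above ...

    public function handle()
    {
        // Example limit: allow at most 10 of these jobs to run per 60 seconds
        Redis::throttle('spaces-uploads')
            ->block(0)   // don't wait for the lock to become available
            ->allow(10)
            ->every(60)
            ->then(function () {
                // ... the upload loop from handle() in the answer above ...
            }, function () {
                // Lock not obtained: release the job back onto the queue and retry in 30 seconds
                return $this->release(30);
            });
    }
}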