
I need to process several image files from a directory (an S3 bucket). The process reads the id and type that are encoded in each filename (for example 001_4856_0-P-0-A_.jpg) and stores that information in a database. The files are already stored at the moment the process is invoked (I'm using cron and the scheduler, which works great).

I have the process working and it works well. My problem is the number of files in the directory, because many more files are added every second. Processing takes about 0.19 seconds per file, but about 15,000 files are added per minute, so I think running the same process several times simultaneously (about 10 to 40 instances) could do the job.

I need some advice or ideas:

First, how to launch multiple simultaneous instances of the one original process.

Second, how to make each instance get only the filenames that have not already been selected by another, because the process takes the filenames with:

  $recibidos = Storage::disk('s3recibidos');
  // List the files first; count() on the disk object itself does not count files.
  $files = $recibidos->files();

  if (count($files) === 0) {
    $lognofile = ['Archivos' => 'No hay archivos para procesar'];
    $orderLog->info('ImagesLog', $lognofile);
  } else {
    // Next sequential number, used as the new filename.
    if (Image::count() == 0) {
      $last_record = 1;
    } else {
      $last_record = Image::latest('id')->pluck('id')->first() + 1;
    }
    $i = $last_record;
    $fotos_sin_info = 0;
    foreach ($files as $file) {
      // e.g. 001_4856_0-P-0-A_.jpg => client_id "001", tipo "0P0A"
      $datos = explode('_', $file);
      $tipos = str_replace('-', '', $datos[2]);
      Image::create([
        'client_id' => $datos[0],
        'tipo' => $tipos,
      ]);
      $recibidos->move($file, '/procesar/'.$i.'.jpg');
      $i++;
    }
  }

But I haven't figured out how to retrieve only the files that have not already been selected.

Thanks for your comments.

  • Please be aware that the way you are selecting the `id` for the next image will not work when you execute the script multiple times in parallel. But in my opinion you don't need to prepare the `id` like this anyway, as `Image::create([...])` will return the freshly created object with the `id` generated by the database. So you can simply do: `$image = Image::create(['client_id' => $datos[0], 'tipo' => $tipos]); $recibidos->move($file, "/procesar/{$image->id}.jpg");` and get rid of all the `id` preparation code. Btw., are all your images always `.jpg` as forced by the code? – Namoshek Apr 25 '18 at 16:00
  • Yes you are right, thanks for the advice. – Carlos Moran Apr 25 '18 at 16:26
  • May I ask how the images are finding the way into the s3 storage? Is it through your application or through some other way? – Namoshek Apr 25 '18 at 16:27
  • I have Android and iOS apps that store images from mobile devices. The Laravel app processes those images, but the flow is about 15,000 images per minute and the process is slow. – Carlos Moran Apr 26 '18 at 13:12

1 Answer

Multi-threaded programming in PHP is possible and has been discussed on SO in "How can one use multi threading in PHP applications". However, this is generally not the most obvious choice for standard applications. A solution for your situation will depend on the exact use case.

Did you consider a solution using queues? https://laravel.com/docs/5.6/queues
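
With a queue, one way to make sure no file is picked up twice is to let a single dispatcher list the directory once and hand each file to exactly one job. The sketch below is plain PHP with no Laravel calls, so it only shows the planning step. `parseImageName()` and `planBatches()` are hypothetical helper names, and the batch size is an arbitrary assumption. In a Laravel app each batch would presumably be pushed to the queue as one job and handled by whichever worker is free.

```php
<?php
declare(strict_types=1);

/**
 * Parse a name such as "001_4856_0-P-0-A_.jpg" into the two fields
 * stored by the code in the question. Returns null when the name
 * has fewer than three "_"-separated parts.
 */
function parseImageName(string $path): ?array
{
    $parts = explode('_', basename($path));
    if (count($parts) < 3) {
        return null;
    }
    return [
        'client_id' => $parts[0],
        'tipo'      => str_replace('-', '', $parts[2]),
    ];
}

/**
 * Drop names that cannot be parsed and split the rest into batches.
 * Every file lands in exactly one batch, so no two jobs share a file.
 */
function planBatches(array $files, int $batchSize): array
{
    $valid = array_filter($files, function (string $file): bool {
        return parseImageName($file) !== null;
    });
    return array_chunk(array_values($valid), max(1, $batchSize));
}

// Example listing (made-up names in the format from the question).
$files = ['001_4856_0-P-0-A_.jpg', '002_4857_1-P-0-B_.jpg', 'broken.jpg'];
$batches = planBatches($files, 2);
```

Because the listing happens in one place, the workers never need to ask which files are "not selected"; they only see the names they were given.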

Or the scheduler? https://laravel.com/docs/5.6/scheduling
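
Whichever of the two you choose, the figures in your question give a rough lower bound on how many processes have to run at once. This is a back-of-the-envelope sketch: `workersNeeded()` is a hypothetical helper, and it assumes the 0.19 seconds per file stays constant when several processes run in parallel, which may not hold once S3 or the database becomes the bottleneck.

```php
<?php
declare(strict_types=1);

/**
 * Minimum number of parallel workers needed to keep up with the
 * arrival rate, assuming each file takes a fixed time to process.
 */
function workersNeeded(int $filesPerMinute, float $secondsPerFile): int
{
    // Seconds of work that arrive per second of wall-clock time.
    $load = $filesPerMinute * $secondsPerFile / 60.0;
    return (int) ceil($load);
}

// Figures from the question: 15,000 files per minute, 0.19 s per file.
echo workersNeeded(15000, 0.19), PHP_EOL;
```

With those figures, 47.5 seconds of work arrive every second, so about 48 parallel processes are needed just to break even. That is slightly above the 10 to 40 you had in mind, before allowing any headroom for bursts.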

Mkk