
I have a directory with hundreds of xlsx files. What I want to do is convert all of these files to PDF, either all at once or a few at a time. The conversion itself is working fine at the moment with a foreach loop and cron, but it can only convert the files one at a time, which increases the waiting time for the user who is waiting for the PDF files.

I am thinking about parallel processing at this time but don't know how to implement this.

Here is my current code:

$common_path = '/var/www/html/conversions/';   // base path for source and target files
$files = glob($common_path . 'xlxs_files/*');

if (!empty($files)) {
    $i = 1;
    foreach ($files as $file) {
        // convert at most 8 files per run
        if (is_file($file) && $i <= 8) {
            echo $i . '-----' . basename($file) . '----' . date('m/d/Y H:i:s', @filemtime($file));
            echo '<br>';

            $path_parts     = pathinfo(basename($file));
            $xlsx_file_name = basename($file);
            $pdf_file_name  = $path_parts['filename'] . '.pdf';

            echo '<br>';

            try {
                echo $log = 'conversion started for ' . basename($file) . ' on ' . date('d-M-Y h:i:s');
                echo '<br>';

                $result = ConvertApi::convert('pdf', ['File' => $common_path . 'xlxs_files/' . $xlsx_file_name], 'xlsx');
                $result->getFile()->save($common_path . 'pdf_files/' . $pdf_file_name);

                echo $log = 'conversion finished for ' . basename($file) . ' on ' . date('d-M-Y h:i:s');
                echo '<br>';

                mail('amit.webethics@gmail.com', 'test', 'test');

                // remove the source file only after the PDF has been saved
                unlink($common_path . 'xlxs_files/' . $xlsx_file_name);
            } catch (Exception $e) {
                $log_file_data = createAlogFile();
                $log = 'There is an error with your file ' . $xlsx_file_name . ' -- ' . $e->getMessage();
                file_put_contents($log_file_data, $log . "\n", FILE_APPEND);
                continue;
            }
            $i++;
        }
    }
} else {
    echo 'nothing to process';
}

Any help will be highly appreciated. Thanks

Amit Sharma

2 Answers


You could start multiple PHP scripts at a time. A detailed answer on how to do that is here: https://unix.stackexchange.com/a/216475/91593 I would go with this solution:

N=4
(
for thing in a b c d e f g; do 
   ((i=i%N)); ((i++==0)) && wait
   task "$thing" & 
done
)
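
Applied to your directory, the same pattern might look roughly like this. This is an untested sketch: convert_one.php is a hypothetical worker script that runs your existing ConvertApi code for the single file passed as its first argument, and N=4 is an arbitrary choice.

N=4
(
for f in /var/www/html/conversions/xlxs_files/*; do
   ((i=i%N)); ((i++==0)) && wait   # after every N started jobs, wait for that batch
   php convert_one.php "$f" &      # convert_one.php: hypothetical single-file worker
done
wait                               # wait for the final batch before exiting
)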

Another way is to use PHP itself. There is an in-depth answer to that question here: https://stackoverflow.com/a/36440644/625521
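
For illustration, here is a minimal sketch of that idea using proc_open(). It is untested, and convert_one.php is again a hypothetical worker script that runs the ConvertApi code from the question for the one file passed as its argument:

// Sketch only: fan the conversions out to several child PHP processes.
$files   = glob('/var/www/html/conversions/xlxs_files/*');
$maxJobs = 4;        // how many conversions to run at the same time
$running = [];

foreach ($files as $file) {
    $cmd  = 'php convert_one.php ' . escapeshellarg($file);
    $proc = proc_open($cmd, [], $pipes);   // start a worker in the background
    if ($proc !== false) {
        $running[] = $proc;
    }

    if (count($running) >= $maxJobs) {
        // proc_close() blocks until the child exits, so this waits
        // for the whole batch before starting the next one.
        foreach ($running as $p) {
            proc_close($p);
        }
        $running = [];
    }
}

foreach ($running as $p) {   // wait for the remaining workers
    proc_close($p);
}

With $maxJobs = 4, up to four conversions run at the same time instead of one.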

Jonas
  • Can we do this without a bash script? – Amit Sharma Sep 21 '20 at 06:02
  • Yes, please use the second solution, the one using PHP. This article will put you on the right track: https://medium.com/hootsuite-engineering/parallel-processing-task-distribution-with-php-2630c5a51fc4 – Jonas Sep 21 '20 at 06:12
  • Thanks, but I do not have anything like a database at the moment. I just have the directory of files, and I want all of them converted into PDF files without much of a time gap. – Amit Sharma Sep 21 '20 at 06:15

Q : I am thinking about parallel processing at this time but don't know how to implement this.

Fact #1:
this is not a case of a true-[PARALLEL] orchestration of the flow of processing; the individual file conversions are independent of one another, so what you actually need is a "just"-[CONCURRENT] batch of independent tasks.

Fact #2:
the standard GNU parallel (for all details, kindly read man parallel) will help you maximise the performance of your processing pipeline, given the list of all files to convert, by tweaking parameters such as the number of CPU cores used and the amount of RAM resources you may reserve/allocate, so as to perform this batch conversion as fast as possible.

ls _files_to_convert.mask_ | parallel --jobs _nCores_  \
                                      --load 99%        \
                                      --block _RAMblock_ \
                                      ...                 \
                                      --dry-run            \
                                      _converting_process_

might serve as an immediate appetiser for what GNU parallel is capable of.
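
For this concrete case, that could translate into something like the following untested sketch, where convert_one.php is again a hypothetical single-file worker and --jobs 4 is an arbitrary choice; --dry-run only prints the commands, so remove it once they look right:

ls /var/www/html/conversions/xlxs_files/* \
 | parallel --jobs 4                       \
            --dry-run                       \
            php convert_one.php {}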

All credits and thanks go to Ole Tange.

user3666197