1

I need to send some images through tesseract, and to save time I want to run tesseract in parallel with up to 6 instances.

I have looked at this question, but can't really figure out how to write the code:

How can one use multi threading in PHP applications

All images are fetched from a database, and the results are written back to the specific row in the database together with the rest of the information related to the image

Could anyone link to an example or could anyone write a quick example on how to do the job?

When a process completes, a new one must be started, so that up to 6 processes are always running at the same time.

update

class Command {
    private $descriptorspec;
    
    private $output = '';
    
    public function __construct(){
        $this->descriptorspec = [
            0 => ['pipe', 'r'], // stdin
            1 => ['pipe', 'w'], // stdout
            2 => ['pipe', 'w'], // stderr
        ];
    }
    
    public function output(): string{
        return $this->output;
    }
    
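    public function exec(string $syntax): string{
        $process = proc_open($syntax, $this->descriptorspec, $pipes);
        if(!is_resource($process)){
            throw new RuntimeException('Could not start: '.$syntax);
        }
        
        // nothing is written to stdin, so close it and let the child see EOF
        fclose($pipes[0]);
        
        // blocks until the child closes stdout
        $this->output = stream_get_contents($pipes[1]);
        fclose($pipes[1]);
        
        $stderr = stream_get_contents($pipes[2]);
        fclose($pipes[2]);
        
        proc_close($process);
        
        return $stderr;
    }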
}

$Cmd = new Command;
$Cmd->exec('tesseract ...');
clarkk
    What you really want is another server running that you offload these processes to. Alternatively, if you were using Laravel, it has a great queue runner module that lets you run side jobs in parallel – Halfpint Aug 10 '16 at 13:47

3 Answers

0

Don't do it. PHP has really shitty support for multithreading. Instead, use multiprocessing: use proc_open & co. -- http://php.net/manual/en/function.proc-open.php

hanshenrik
  • You are not right. The php kernel works fine with multithreading. But I agree that it is not always necessary. – Maxim Tkach Aug 10 '16 at 14:15
  • @MaximTkach no it doesn't. quoting a few warnings: `When print_r, var_dump and other object debug functions are executed, they do not include recursion protection.` - `Any objects that are intended for use in the multi-threaded parts of your application should extend Threaded.` (without saying why) - `upon starting the context, a class whose static members include connection information for a database server, and the connection itself, will only have the simple connection information copied, not the connection` – hanshenrik Aug 10 '16 at 14:22
  • `Resources: The extensions and functionality that define resources in PHP are completely unprepared for this kind of environment; pthreads makes provisions for Resources to be shared among contexts, however, for most types of resource it should be considered unsafe. Extreme caution and care should be used when sharing resources among contexts.` – hanshenrik Aug 10 '16 at 14:25
  • and at least in PHP5, threading was unavailable when running as an apache/IIS module, for some technical reason. i've thus concluded, PHP has really shitty support for threading. – hanshenrik Aug 10 '16 at 14:27
  • And because of this you are not satisfied with pthreads? Multithreading always imposes some restrictions; that cannot be considered "really shitty support". I use multithreading for an online web game, and everything works fine. :) – Maxim Tkach Aug 10 '16 at 14:31
  • @MaximTkach when using a thread breaks functions like var_dump, no, i'm not satisfied with the threading support. – hanshenrik Aug 10 '16 at 14:33
  • OK, I understand that you do not like multi-threading, but it really is much faster than the fork – Maxim Tkach Aug 10 '16 at 14:46
  • I have a quick question about `proc_open`.. When you execute a command line with `proc_open` PHP moves on in the script and doesn't wait until the command is completed? – clarkk Aug 11 '16 at 17:19
  • @clarkk correct, it doesn't wait at all. if you want to wait, you'll have to do something like: `while(proc_get_status($h)['running']){sleep(1);}` – hanshenrik Aug 11 '16 at 17:31
  • Ok, so you can check the status? could you write a quick example how to start a process and track it? :) – clarkk Aug 11 '16 at 17:48
  • if you add the link to github in your answer I will accept it – clarkk Aug 14 '16 at 13:19
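To make the "start a process and track it" idea from the comments concrete, here is a minimal sketch of a pool built only on `proc_open` and `proc_get_status`. The function name `runPool()` and the shape of its return value are assumptions made for illustration, and the non-blocking pipe reads assume a POSIX system. It keeps up to `$maxParallel` commands running and starts the next one as soon as a slot frees up.

```php
<?php
/**
 * Run shell commands with at most $maxParallel of them alive at once.
 * Returns [key => ['exit' => int, 'stdout' => string, 'stderr' => string]].
 * runPool() is a hypothetical helper name, not part of PHP.
 */
function runPool(array $commands, int $maxParallel = 6): array
{
    $spec = [1 => ['pipe', 'w'], 2 => ['pipe', 'w']]; // capture stdout and stderr
    $running = [];
    $results = [];

    while ($commands || $running) {
        // Top up: start new processes until the limit is reached.
        while ($commands && count($running) < $maxParallel) {
            $key = array_key_first($commands);
            $cmd = $commands[$key];
            unset($commands[$key]);

            $pipes = [];
            $proc = proc_open($cmd, $spec, $pipes);
            if (!is_resource($proc)) {
                $results[$key] = ['exit' => -1, 'stdout' => '', 'stderr' => 'proc_open failed'];
                continue;
            }
            // Non-blocking pipes let the loop poll without stalling on one child.
            stream_set_blocking($pipes[1], false);
            stream_set_blocking($pipes[2], false);
            $running[$key] = ['proc' => $proc, 'pipes' => $pipes, 'stdout' => '', 'stderr' => ''];
        }

        foreach ($running as $key => $job) {
            // Drain whatever is available so a chatty child cannot fill the pipe buffer.
            $running[$key]['stdout'] .= stream_get_contents($job['pipes'][1]);
            $running[$key]['stderr'] .= stream_get_contents($job['pipes'][2]);

            $status = proc_get_status($job['proc']);
            if ($status['running']) {
                continue;
            }

            // The child has exited: read the remainder up to EOF, then clean up.
            foreach ([1 => 'stdout', 2 => 'stderr'] as $fd => $name) {
                stream_set_blocking($job['pipes'][$fd], true);
                $running[$key][$name] .= stream_get_contents($job['pipes'][$fd]);
                fclose($job['pipes'][$fd]);
            }
            proc_close($job['proc']);

            $results[$key] = [
                'exit'   => $status['exitcode'],
                'stdout' => $running[$key]['stdout'],
                'stderr' => $running[$key]['stderr'],
            ];
            unset($running[$key]); // frees a slot for the next command
        }

        usleep(50000); // poll every 50 ms instead of spinning
    }

    return $results;
}
```

For the OCR job you would build one `tesseract ...` command string per image, keyed by the row id, pass the array to `runPool($commands, 6)`, and write each result back to its row. Polling with `usleep` is the simplest approach; `stream_select` would avoid the fixed delay at the cost of more code.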
0

You do not need multithreading. To run the 6 workers without having to sync with the parent process, you can use proc_open.

Multithreading is faster, takes up fewer resources (there is no separate process for each thread), and you can run about 4 times more threads than forks. But multithreading requires PHP to be built with ZTS (some extensions do not work with ZTS), and it requires you to understand the pthreads model.

For example:

We create 6 workers with proc_open, and the base process listens for child processes that die. I use my own wrapper (child-processes), which works with the Ev lib, the Event lib, or without either; you may prefer the ReactPHP wrapper (child-processes).

    $child = new ChildProcesses();
    $child->add('<system command>');
    $child->add('<system command>');
    $child->add('<system command>');
    $child->add('<system command>');
    $child->add('<system command>');
    $fails = $child->check(null, function(ChildProcess $process) {
        echo 'Error with child process';
    });

Or use pthreads; read the official documentation. You need to recompile PHP with ZTS enabled and install the pthreads extension. For your problem, I would use Pool.

Look at the first example on this page: Pool pthreads

Update:

If you need six workers running all the time, create six listeners and let the processes communicate through zmq, rabbitmq, gearman or some other queue. Your processes never exit on their own, and any that crash are restarted by supervisord.

For example:

// assumes: use React\EventLoop\Factory; use React\ZMQ\Context;
$loop = Factory::create();
$context = new Context($loop);
// connect/subscribe/on are called on the socket, not on the context
$sub = $context->getSocket(\ZMQ::SOCKET_SUB);
$sub->connect($host);
$sub->subscribe('you_queue');
$sub->on('messages', function($messages){
    // get specific data and run operation
});
$loop->run();

Be careful: zmq is not quite a queue.

  1. It does not save messages.
  2. react/zmq uses EventLib, so my example is actually asynchronous.

You should use Gearman or RabbitMQ; they have the ideal functionality for your task.


Regards Maxim

Maxim Tkach
  • When a process is completed, how to start a new one so there always will be up to 6 processes running at the same time? – clarkk Aug 10 '16 at 14:26
  • I had assumed you needed workers that do their job and die. If you always need six workers, each worker must be a queue listener. I recommend using ZMQ (react/zmq) or Redis to listen for new tasks. – Maxim Tkach Aug 10 '16 at 14:39
0

I wrote an example (based on child-processes). This should solve your problem:

This is parent.php:

$child = new ChildProcesses();
$child->addProcessInstance(new \TkachInc\ChildProcesses\ChildProcess('php worker.php -w1'));
$child->addProcessInstance(new \TkachInc\ChildProcesses\ChildProcess('php worker.php -w2'));
$child->addProcessInstance(new \TkachInc\ChildProcesses\ChildProcess('php worker.php -w3'));
$child->addProcessInstance(new \TkachInc\ChildProcesses\ChildProcess('php worker.php -w4'));
$child->addProcessInstance(new \TkachInc\ChildProcesses\ChildProcess('php worker.php -w5'));
$child->addProcessInstance(new \TkachInc\ChildProcesses\ChildProcess('php worker.php -w6'));
$child->daemon();

And your worker.php (I added example output; you need to delete it):

$options = getopt("w:");
if(isset($options['w']))
{
    echo $options['w'].PHP_EOL;

    switch ($options['w'])
    {
        case "1":
            sleep(5);
            break;
        case "2":
            sleep(5);
            break;
        case "3":
            sleep(7);
            break;
        case "4":
            sleep(5);
            break;
        case "5":
            sleep(7);
            break;
        case "6":
            sleep(5);
            break;
    }
}

// your logic to receive from database and write to database
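With six workers reading from the same table, each worker has to make sure it does not pick an image that another worker is already processing. Below is a rough sketch of one way to do that with PDO. The table `images` and its columns `id`, `path`, `status` and `ocr_text` are assumptions, as are the helper names `claimNextImage()` and `storeResult()`; adapt them to your real schema.

```php
<?php
// Hypothetical schema: images(id, path, status, ocr_text). All names are assumptions.

/** Claim one pending image for this worker, or return null when none are left. */
function claimNextImage(PDO $db): ?array
{
    while (true) {
        $row = $db->query("SELECT id, path FROM images WHERE status = 'pending' ORDER BY id LIMIT 1")
                  ->fetch(PDO::FETCH_ASSOC);
        if ($row === false) {
            return null; // nothing left to do
        }

        // Only one worker can flip the row from 'pending' to 'processing'.
        $claim = $db->prepare("UPDATE images SET status = 'processing' WHERE id = ? AND status = 'pending'");
        $claim->execute([$row['id']]);
        if ($claim->rowCount() === 1) {
            return $row;
        }
        // Another worker claimed it first: loop and try the next pending row.
    }
}

/** Write the OCR text back to the claimed row and mark it as done. */
function storeResult(PDO $db, int $id, string $text): void
{
    $stmt = $db->prepare("UPDATE images SET status = 'done', ocr_text = ? WHERE id = ?");
    $stmt->execute([$text, $id]);
}
```

Each worker would loop on `claimNextImage()`, run tesseract on the returned path, call `storeResult()`, and exit (or wait) once `null` comes back. The conditional `UPDATE` is the important part: the `rowCount()` check tells the worker whether it won the row.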

Regards Maxim

Maxim Tkach