
I need to write a script that takes an array of values and, in a multithreaded way (forks?), runs another script once per value, passing the value as a parameter. There should be a cap on the number of forks running at once: if n are already running, the script should wait for one to finish before starting the next. How do I do that?

There is a module named child_process, but I'm not sure how to get this done with it, as it always seems to wait for the child to terminate.

Basically, in PHP it would be something like this (written from memory, so it may contain some syntax errors):

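<?php
    declare(ticks = 1);
    $data = file('data.txt');

    $max = 20;   // maximum number of children running at once
    $child = 0;  // number of children currently running

    // Runs when a child exits and frees the slots of all finished children.
    function sig_handler($signo) {
        global $child;
        switch ($signo) {
            case SIGCHLD:
                // Several exits can be delivered as a single signal, so reap them all.
                while (pcntl_waitpid(-1, $status, WNOHANG) > 0) {
                    $child -= 1;
                }
                break;
        }
    }

    pcntl_signal(SIGCHLD, "sig_handler");

    foreach ($data as $dataline) {
        $dataline = trim($dataline);

        // Wait until a slot is free.
        while ($child >= $max) {
            sleep(1);
        }

        $child++;

        $pid = pcntl_fork();

        if ($pid === -1) {
            // Fork failed: release the slot reserved above.
            $child--;
        } elseif ($pid === 0) {
            // Child: run the worker script with the value as its argument, then exit.
            exec('php processdata.php ' . escapeshellarg($dataline));
            exit;
        }
        // Parent: continue with the next line.
    }

    // Wait for the remaining children to finish.
    while ($child != 0) {
        sleep(1);
    }
?>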
Flash Thunder
  • Sorry for the question, but it's a must every time someone asks about multithreading in Node.js: are you SURE you need to do that on a separate thread? Node.js is extremely fast at performing parallel operations even without multithreading. Unless you need to do a lot of calculations without ever touching an I/O buffer, you may not need multiple threads. – ItalyPaleAle Oct 23 '14 at 12:53
  • If your "processdata" script performs actions that interact with a database, a file or a remote server, then in Node you probably won't need to fork it to another thread. – ItalyPaleAle Oct 23 '14 at 12:55
  • To be honest, I don't really need it to be multithreaded, as long as it does the same thing... so it may be done with callbacks... but it has to be a separate script that is being called. Not hard-coded. – Flash Thunder Oct 23 '14 at 12:55
  • What does "processdata" do in your PHP code, at the moment? – ItalyPaleAle Oct 23 '14 at 12:56
  • At the moment, it connects to a few servers, gathers information and writes it to `mongodb`. It also executes some Unix commands (`nslookup`, for example)... I need to rewrite it in `node.js` too, but that's not what the question is about :) – Flash Thunder Oct 23 '14 at 12:56
  • OK, so in that case, with Node, you will likely NOT need a separate thread. Unlike PHP, Node has an event loop that performs other actions while it's waiting for external I/O (for example, getting data from remote servers, communicating with mongo, and even waiting for `nslookup` to finish! - ps check http://nodejs.org/api/dns.html :) ). Since multithreading in Node is quite painful, I'd suggest trying with a single thread first. In the rare case that it doesn't give you adequate performance, upgrade to multithreading! – ItalyPaleAle Oct 23 '14 at 13:00
  • OK, it may be done with callbacks, as long as it executes an external command. At the beginning it will just run php... so I need node.js to do what is described in the question, no matter how. I mean, it doesn't really have to be real "multithreading"... the array contains about 1000000 elements. – Flash Thunder Oct 23 '14 at 13:02

1 Answer


After the conversation in the comments, here's how to have Node execute your PHP script.

Since you're calling an external command, there's no need to create a new thread. Node.js starts external commands asynchronously: the event loop keeps running while they execute, so many of them can run at the same time.

You can see different ways to execute an external process in this SO question (the linked answer may be the best in your case).
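
To cover the "at most n running" part of the question, a minimal sketch using `execFile` from the core `child_process` module could look like the following. The names `runAll`, `limit`, `command` and `baseArgs` are made up for illustration and do not come from the question or from the linked answer; the sketch also assumes every value is a string.

```javascript
const { execFile } = require('child_process');

// Run `command` once per value, never keeping more than `limit` children
// alive at the same time. All names here are hypothetical.
function runAll(values, limit, command, baseArgs = []) {
  return new Promise((resolve) => {
    const results = new Array(values.length);
    let next = 0;    // index of the next value to start
    let running = 0; // number of children currently alive

    function launch() {
      // Everything has been started and every child has finished.
      if (next >= values.length && running === 0) {
        resolve(results);
        return;
      }
      // Fill the free slots, up to the limit.
      while (running < limit && next < values.length) {
        const index = next++;
        running++;
        // execFile passes arguments directly, without going through a shell.
        execFile(command, [...baseArgs, values[index]], (error, stdout) => {
          running--;
          results[index] = { value: values[index], error, stdout };
          launch(); // a slot was freed, so start the next value if any
        });
      }
    }

    launch();
  });
}

// Hypothetical usage mirroring the PHP loop from the question:
// runAll(lines, 20, 'php', ['processdata.php']).then((results) => { ... });
```

Unlike the PHP version, nothing sleeps or polls here: each child's callback frees a slot and immediately starts the next value. With around 1000000 values you would probably not want to keep every `stdout` in memory, so the `results` array is only there to keep the sketch easy to inspect.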

However, since you're already moving everything to Node, you may even consider rewriting your "processdata.php" script in Node.js. Since, as you explained, that script connects to remote servers and databases and uses nslookup (which you may not really need with Node.js), you won't need any separate thread: they're all async operations that Node.js excels at performing.
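
As an illustration of why `nslookup` may not be needed, here is a small sketch based on the core `dns` module mentioned in the comments. The function name `lookupHost` is hypothetical. Note that `dns.lookup` goes through the operating system's resolver; if you need to query DNS servers directly, which is closer to what `nslookup` does, the same module also offers `resolve4`.

```javascript
const dns = require('dns').promises;

// Resolve a host name without spawning an external process.
// `lookupHost` is an illustrative name, not part of the original script.
async function lookupHost(hostname) {
  const { address, family } = await dns.lookup(hostname);
  return { hostname, address, family }; // family is 4 (IPv4) or 6 (IPv6)
}
```

Because the call is asynchronous, many lookups can be in flight at once while the rest of the script keeps running.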

ItalyPaleAle