
Suppose this were a Linux shell. What I want to do is:

copy file1 tmp
rename tmp file2

I can do it with async.waterfall:

function copyFile(cb) {
    child_process.exec('cp file1 tmp', function (error, stdout, stderr) {
        ......
    });
}
async.waterfall([
    copyFile,
    renameFile
], function (error) {
    if (error) {
        // handle copyFile error or renameFile error here
    }
});

or I guess I can do:

child_process.execSync('cp file1 tmp');
child_process.execSync('rename tmp file2');

What's the difference between the two, e.g. in performance or blocking? Thanks very much!

peteb
user3552178
  • Well, the result is the same, as you already know, but the first version is async I/O style, which is best for Node.js and should perform better, while the second version is more readable. http://stackoverflow.com/questions/10570246/what-is-non-blocking-or-asynchronous-i-o-in-node-js – gevorg Jun 06 '16 at 19:20
  • For version 1, does it create a separate process? – user3552178 Jun 06 '16 at 19:55
  • Sure it does; child_process.exec always creates a process. – gevorg Jun 06 '16 at 19:56
  • Actually, my question is: does waterfall create a child process as well? It has to, I guess. – user3552178 Jun 06 '16 at 21:20
  • No, `async.waterfall` does not create processes by itself, but if you use `child_process.exec` inside it, a process is created during execution. – gevorg Jun 06 '16 at 21:22
  • oh, async.waterfall doesn't create process by itself, guess the performance is not good. Thanks ! – user3552178 Jun 06 '16 at 21:48

1 Answer


The primary difference is that execSync is blocking and exec is non-blocking. execSync blocks the calling process until the child process it created exits. exec returns immediately and delivers the child's output later through its callback, so it does not block the parent process. Apart from blocking, the two behave identically.
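To make the blocking difference concrete, here is a minimal sketch. It assumes a shell where `echo` is available; the `order` array is a hypothetical name introduced only to record when each call hands control back.

```javascript
const { exec, execSync } = require('child_process');

// Hypothetical log of the order in which things happen.
const order = [];

// execSync does not return until the child process has exited.
const syncOutput = execSync('echo sync').toString().trim();
order.push('execSync returned: ' + syncOutput);

// exec returns right away; its callback runs later, once the child has exited.
exec('echo async', function (error, stdout) {
    if (error) {
        order.push('exec failed: ' + error.message);
        return;
    }
    order.push('exec callback: ' + stdout.trim());
});

// This line runs before the exec callback above.
order.push('after exec call');
```

At this point `order` holds only the execSync entry and the 'after exec call' entry; the exec callback entry is appended later, once the current synchronous code has finished and the child has exited.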

async.waterfall is a control flow mechanism that only guarantees that the functions run in order and that each function's results are passed to the next function in the chain. If one of the functions passed to async.waterfall contains blocking code, then async.waterfall is blocked as well. async.waterfall does not guarantee that the code executed inside it is async.
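The idea can be sketched with a hand-rolled stand-in. `miniWaterfall` below is a hypothetical, simplified helper written only for illustration; it is not the async library's implementation. It shows that a waterfall merely sequences functions and forwards results, so synchronous tasks make the whole chain run synchronously.

```javascript
// Hypothetical, simplified stand-in for async.waterfall (illustration only).
function miniWaterfall(tasks, done) {
    let index = 0;

    function next(err, ...results) {
        // Stop on the first error, or when every task has run.
        if (err || index === tasks.length) {
            return done(err || null, ...results);
        }
        const task = tasks[index++];
        // Pass the previous task's results, plus the callback, to the next task.
        task(...results, next);
    }

    next(null);
}

const trace = [];

miniWaterfall([
    function (cb) { trace.push('step 1'); cb(null, 'tmp'); },
    function (name, cb) { trace.push('step 2 got ' + name); cb(null, 'file2'); }
], function (err, result) {
    trace.push(err ? 'failed' : 'done: ' + result);
});
```

Because both tasks here call their callbacks synchronously, the final callback has already run by the time `miniWaterfall` returns. Nothing in the sequencing itself makes the work asynchronous.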

Using child_process means the command runs in a separate process rather than in the main Node process. You shouldn't use child_process for control flow, because creating and destroying a child process carries overhead. Unless you're doing CPU-intensive work or otherwise need a separate process, you should avoid it.

If you want to execute things synchronously, you can wrap the synchronous calls in a try/catch block to handle their errors, but I would definitely say don't use child_process for control flow.
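For completeness, a synchronous sequence with try/catch might look like the sketch below. `runInOrder` is a hypothetical helper name, and the sketch assumes the commands are run through a shell. It blocks the event loop for the entire run, which is the cost described above.

```javascript
const { execSync } = require('child_process');

// Hypothetical helper: run shell commands one after another, blocking until
// each one exits. Returns null on success, or the Error from the first failure.
function runInOrder(commands) {
    try {
        for (const command of commands) {
            // execSync throws if the command exits with a non-zero status.
            execSync(command, { stdio: 'pipe' });
        }
        return null;
    } catch (err) {
        return err;
    }
}
```

With the commands from the question this would be called as `runInOrder(['cp file1 tmp', 'mv tmp file2'])`, assuming `mv` is the intended way to rename the file on Linux.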

From a performance perspective, both approaches are bad because they create a child process, but exec() is the better of the two because it returns immediately to the creating process, allowing other code to continue executing. Whenever you use blocking code in Node, you give up the primary benefit of using Node. There are situations where blocking is necessary, like requiring modules, but in most scenarios there is a non-blocking alternative.

As an aside, if you're trying to copy a file, you can use the fs module and pipe the original file into a new file at a new destination with a new name. The approach below is async and doesn't require any external dependencies or a control flow library. Additionally, it should be more performant than any of the code above.

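var fs = require('fs');

function copy (src, dest, callback) {
    var r = fs.createReadStream(src);
    var w = fs.createWriteStream(dest);
    var called = false;

    // Invoke the callback only once, with the first error if there is one.
    function done (err) {
        if (called) { return; }
        called = true;
        callback(err);
    }

    // Either stream can fail, e.g. missing source or unwritable destination.
    r.on('error', done);
    w.on('error', done);

    // 'finish' fires once all data has been flushed to the destination.
    w.once('finish', function () {
        done();
    });

    r.pipe(w);
}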
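The two steps from the question (copy to a temporary name, then rename) can also be chained with plain fs callbacks and no child process. This is a sketch that assumes a Node version providing `fs.copyFile`; `copyThenRename` is a hypothetical name, not something from the question.

```javascript
const fs = require('fs');

// Hypothetical helper: copy src to tmp, then rename tmp to dest.
// Both steps are non-blocking; the second starts only after the first succeeds.
function copyThenRename(src, tmp, dest, callback) {
    fs.copyFile(src, tmp, function (copyErr) {
        if (copyErr) {
            return callback(copyErr);
        }
        // fs.rename calls back with an error, or null on success.
        fs.rename(tmp, dest, callback);
    });
}
```

A call such as `copyThenRename('file1', 'tmp', 'file2', function (err) { ... })` would mirror the shell steps while leaving the event loop free during the I/O.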
peteb
  • Thanks very much for the detailed info. So you think try/catch is the best method compared with async/child_process? Could you please write some code to demonstrate it? (I picked file copy as an example; in reality it's other commands, so createReadStream won't apply.) Many thanks! – user3552178 Jun 06 '16 at 21:31
  • @user3552178 what are you trying to do that you don't want to use a non-blocking approach? – peteb Jun 06 '16 at 21:32
  • if i understand you correctly, you think child_process.exec() is better than async in my case ? if so, i'm fine with it. I just don't know how to write code in this case for "If you want to execute things synchronously you can wrap all of your code in a try/catch block ". My question might be confusing, since i need to digest what you're telling me, :) Thanks ! – user3552178 Jun 06 '16 at 21:47
  • @user3552178 I'm saying that `async` is better if the code passed in to execute is non-blocking. You should always avoid using `child_process` unless absolutely necessary due to the overhead associated with creating new processes. In node, `try/catch` blocks are always executed synchronously so if you have a bunch of statements you'd like to execute synchronously you can simply do `try { // do whatever you need in here } catch(err) { console.log(err); }` Without knowing the scenario you want to actually execute I can't create a better example than that. – peteb Jun 06 '16 at 21:49
  • in my case, step 2 needs to wait for the result from step 1, so guess it's blocking. Thanks ! – user3552178 Jun 06 '16 at 21:51
  • That's not blocking, that's just needing control flow. You should use [`async.waterfall`](https://github.com/caolan/async#waterfall) in that case. Blocking vs. non-blocking refers to whether or not the process/thread can continue with execution while waiting for something to finish. If it can't continue, then it is **blocking**; if it can continue and will check for the value later, then it is **non-blocking**. – peteb Jun 06 '16 at 21:53
  • My previous comment could be wrong. All the commands I'm running are file related, e.g. editing contents. They need to be run in sequence. But from Node.js's point of view, they should be non-blocking, if I get this right. I'll read more into your try/catch solution, I like that better. – user3552178 Jun 06 '16 at 21:56
  • You should just use a callback structure or promises then. If you use `try/catch` it will absolutely block. Look through the `fs` module and you'll see you can do whatever you'd like to a file and its contents totally async. – peteb Jun 06 '16 at 21:57
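If the steps really must be shell commands, the promise suggestion in the last comment could be sketched as follows. It assumes a Node version with `util.promisify` and async/await; `runSequentially` and `execAsync` are hypothetical names. Each command waits for the previous one, yet the event loop is never blocked.

```javascript
const util = require('util');
const childProcess = require('child_process');

// Promise-returning version of exec; resolves with { stdout, stderr }.
const execAsync = util.promisify(childProcess.exec);

// Hypothetical helper: run commands strictly in order without blocking.
// Resolves with each command's trimmed stdout; rejects on the first failure.
async function runSequentially(commands) {
    const outputs = [];
    for (const command of commands) {
        const { stdout } = await execAsync(command);
        outputs.push(stdout.trim());
    }
    return outputs;
}
```

This keeps the "step 2 waits for step 1" ordering as pure control flow, while other work in the same process can still run between the steps.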