
I have a PhantomJS/CasperJS script which I'm running from within a node.js script using child_process.spawn(). Since CasperJS doesn't support require()ing modules, I'm trying to print commands from CasperJS to stdout and then read them in my node.js script using spawn.stdout.on('data', function(data) {}); in order to do things like add objects to redis/mongoose (convoluted, yes, but it seems more straightforward than setting up a web service for this...). The CasperJS script executes a series of commands and creates, say, 20 screenshots which need to be added to my database.

However, I can't figure out how to break the data variable (a Buffer?) into lines. I've tried converting it to a string and then doing a replace, and I've tried spawn.stdout.setEncoding('utf8'), but nothing seems to work.

Here is what I have right now:

var spawn = require('child_process').spawn;

var bin = "casperjs"
//googlelinks.js is the example given at http://casperjs.org/#quickstart
var args = ['scripts/googlelinks.js'];
var cspr = spawn(bin, args);

//cspr.stdout.setEncoding('utf8');
cspr.stdout.on('data', function (data) {
    var buff = new Buffer(data);
    console.log("foo: " + buff.toString('utf8'));
});

cspr.stderr.on('data', function (data) {
    data += '';
    console.log(data.replace("\n", "\nstderr: "));
});

cspr.on('exit', function (code) {
    console.log('child process exited with code ' + code);
    process.exit(code);
});

https://gist.github.com/2131204

Jesse Fulton
    Is this the best approach? It seems like the `stdout.on('data')` event fires depending upon buffer size, not necessarily new lines. Is this true? – Jesse Fulton Mar 20 '12 at 04:19

7 Answers


Try this:

cspr.stdout.setEncoding('utf8');
cspr.stdout.on('data', function(data) {
  var str = data.toString(), lines = str.split(/(\r?\n)/g);
  for (var i=0; i<lines.length; i++) {
    // Process the line, noting it might be incomplete.
  }
});

Note that the "data" event might not necessarily break evenly between lines of output, so a single line might span multiple data events.

maerics
  • Weird, I'm on OSX - I thought "\r\n" was Windows. But it seems to work! (after adding a handful of missing parentheses :p) – Jesse Fulton Mar 20 '12 at 04:31
    @JesseFulton: the `\r` is optional from the regex special character `?` so this code should work on both UNIX and Windows; it's making the regular expression global (`.../g`) that was probably critical here. The call to "replace" in your sample code used a plain string which gets converted into a non-global regex, so you probably got just two lines instead of all of them. – maerics Mar 20 '12 at 05:06
  • Ah, yea you're right. String.replace(String, String) isn't global - you need to use a regex as the first param and add the 'g' switch. – Jesse Fulton Mar 20 '12 at 05:57
  • While this solution sometimes works, I'd like to see the one which always works. – Puma Aug 01 '17 at 22:20

I've actually written a Node library for exactly this purpose; it's called stream-splitter and you can find it on GitHub: samcday/stream-splitter.

The library provides a special Stream you can pipe your casper stdout into, along with a delimiter (in your case, \n), and it will emit neat "token" events, one for each line it has split out from the input Stream. The internal implementation is very simple and delegates most of the magic to substack/node-buffers, which means there are no unnecessary Buffer allocations/copies.
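Roughly, wiring it up to the casper child process looks something like this (a sketch based on the description above; check the project README for the exact API):

var spawn = require('child_process').spawn;
var StreamSplitter = require('stream-splitter');

var cspr = spawn('casperjs', ['scripts/googlelinks.js']);

// Pipe the child's stdout through the splitter, splitting on newlines.
var splitter = cspr.stdout.pipe(StreamSplitter('\n'));

// One "token" event per complete line; tokens arrive as Buffers here.
splitter.on('token', function (token) {
    console.log('line: ' + token.toString('utf8'));
});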

Sam Day

I found a nicer way to do this with just pure node, which seems to work well:

const childProcess = require('child_process');
const readline = require('readline');

const cspr = childProcess.spawn(bin, args);

const rl = readline.createInterface({ input: cspr.stdout });
rl.on('line', line => { /* handle line here */ });

nyctef

Adding to maerics' answer, which does not deal properly with cases where only a partial line arrives in a data chunk (that version will give you the first part and the second part of the line as two separate lines).

var _breakOffFirstLine = /\r?\n/
function filterStdoutDataDumpsToTextLines(callback){ //returns a function that takes chunks of stdin data, aggregates it, and passes lines one by one through to callback, all as soon as it gets them.
    var acc = ''
    return function(data){
        var splitted = data.toString().split(_breakOffFirstLine)
        var inTactLines = splitted.slice(0, splitted.length-1)
        inTactLines[0] = acc+inTactLines[0] //if there was a partial, unended line in the previous dump, it is completed by the first section.
        acc = splitted[splitted.length-1] //if there is a partial, unended line in this dump, store it to be completed by the next (we assume there will be a terminating newline at some point. This is, generally, a safe assumption.)
        for(var i=0; i<inTactLines.length; ++i){
            callback(inTactLines[i])
        }
    }
}

usage:

cspr.stdout.on('data', filterStdoutDataDumpsToTextLines(function(line){
    //each time this inner function is called, you will be getting a single, complete line of the child's stdout ^^
}));
mako

You can give this a try. It ignores empty lines and the bare newline separators that the capturing group leaves in the array.

cspr.stdout.on('data', (data) => {
    const lines = data.toString().split(/(\r?\n)/g);
    lines.forEach((item) => {
        // the capturing group keeps the newline separators in the array, so skip them and any empty strings
        if (item !== '\r\n' && item !== '\n' && item !== '') {
            console.log(item);
        }
    });
});
Rick

Old stuff but still useful...

I have made a custom stream Transform subclass for this purpose.

See https://stackoverflow.com/a/59400367/4861714
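The idea, roughly, is something like this (a minimal sketch of a line-splitting Transform, not the exact code from the linked answer):

const { Transform } = require('stream');

class LineSplitter extends Transform {
    constructor() {
        super({ readableObjectMode: true }); // push one string per line downstream
        this.remainder = '';
    }
    _transform(chunk, encoding, callback) {
        const pieces = (this.remainder + chunk.toString()).split(/\r?\n/);
        this.remainder = pieces.pop(); // keep any trailing partial line for the next chunk
        pieces.forEach(line => this.push(line));
        callback();
    }
    _flush(callback) {
        if (this.remainder) this.push(this.remainder); // emit whatever is left at the end
        callback();
    }
}

// usage: cspr.stdout.pipe(new LineSplitter()).on('data', line => console.log(line));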

Julio

@nyctef's answer uses readline, a built-in Node.js module.

Here is a link to the documentation: https://nodejs.org/api/readline.html

The node:readline module provides an interface for reading data from a Readable stream (such as process.stdin) one line at a time.

My personal use case is parsing JSON output from a "docker watch" command run in a spawned child_process.

  const dockerWatchProcess = spawn(...)
  ...
  const rl = readline.createInterface({
    input: dockerWatchProcess.stdout,
    output: null,
  });

  rl.on('line', (log) => {
    console.log('dockerWatchProcess event::', log);
    // code to process a change to a docker event
    ...
  });
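Fleshed out, the 'line' handler above ends up looking something like this (a sketch; it assumes each line of output is one complete JSON object):

  rl.on('line', (log) => {
    let event;
    try {
      event = JSON.parse(log); // assumes one complete JSON object per line
    } catch (err) {
      console.error('skipping non-JSON line:', log);
      return;
    }
    // ...react to the parsed docker event here
  });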
jsgresham