
Is there a better way, in Node.js, to accept JSON data than appending chunked input to a string and running JSON.parse() on it afterward? The server cannot assume that the JSON input is valid, and JSON.parse() is the only way I'm aware of to validate JSON data.

Assume the following server accepts:

  • only POST requests
  • only JSON data
  • only with a Content-Type header of "application/json"


var http = require('http');

var server = http.createServer(function(req, res) {
    ....

    var parsedData;
    var inputData = '';

    req.on('data', function(chunk) {
        inputData += chunk;
    });

    req.on('end', function() {
        parsedData = JSON.parse(inputData);
    });

    ....
});
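
For context, a minimal sketch of how those three constraints might be checked before the body is read; the 405/415 status codes and response messages are illustrative assumptions, not part of the original question:

var http = require('http');

var server = http.createServer(function(req, res) {
    // Only POST requests are accepted.
    if (req.method !== 'POST') {
        res.writeHead(405, { 'Content-Type': 'text/plain' });
        return res.end('Method Not Allowed');
    }

    // Only a Content-Type of application/json is accepted
    // (indexOf covers variants like "application/json; charset=utf-8").
    var contentType = req.headers['content-type'] || '';
    if (contentType.indexOf('application/json') !== 0) {
        res.writeHead(415, { 'Content-Type': 'text/plain' });
        return res.end('Unsupported Media Type');
    }

    // ... collect the body and JSON.parse() it as above ...
});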

If I am enforcing the input data so strictly, it seems strange to simply append this data to a string and then JSON.parse() it, essentially (assuming proper JSON is input) going JSON -> string -> JSON. (Of course, one cannot assume the data is valid.)

Bryson
  • Are you saying that you don't need the `parsedData` output of `JSON.parse` and are just looking to validate `inputData` instead? – JohnnyHK Oct 15 '12 at 03:56
  • Yes. If it's JSON deal with it, if it isn't JSON, error. `JSON.parse()` does this by converting the string back into JSON, but I am wondering if there is a simpler way that just keeps the JSON format. JSON input to string back to `JSON.parse()` does work fine, however. – Bryson Oct 15 '12 at 04:42

1 Answer


What you're doing is fine, really. However, a performance tweak might be to add each chunk to an array and then join the array, instead of creating a new string each time.

e.g.

var inputData = [];
var parsedData;

req.on('data', function(chunk) {
    inputData.push(chunk); // assuming chunk is from a utf-8 stream; if not, call chunk.toString()
});

req.on('end', function() {
    try {
        parsedData = JSON.parse(inputData.join(''));
    } catch (err) {
        // handle error
    }
});

This way you are not reallocating and creating a new string each time with inputData += chunk. Note that this is fine for the expected small inputs, as V8 optimizes this case quite well.
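
As an aside, if the request body can contain multi-byte UTF-8 characters, decoding each chunk individually can split a character across chunk boundaries. A safer variant of the same pattern (a sketch; Buffer.concat is standard Node) collects the raw Buffers and decodes once at the end:

var inputData = [];
var parsedData;

req.on('data', function(chunk) {
    inputData.push(chunk); // keep the raw Buffer; do not decode yet
});

req.on('end', function() {
    try {
        // Decode the whole body in one pass, so multi-byte characters
        // split across chunk boundaries are reassembled correctly.
        parsedData = JSON.parse(Buffer.concat(inputData).toString('utf8'));
    } catch (err) {
        // handle error
    }
});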

Now, on the note of bad JSON: you do have options that you may want to look at.

  1. JSONStream - I'm not quite sure how this handles parsing errors, but it's quite nice for large JSON input (a rough sketch follows this list).

  2. json-scrape - This does really well at filtering out bad input that isn't JSON.
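
To make option 1 concrete, here is a rough sketch of the streaming approach. The 'rows.*' selector is an illustrative assumption matching each element of a top-level "rows" array, and the 'error' event behavior is worth verifying against the module's docs:

var JSONStream = require('JSONStream');

var parser = JSONStream.parse('rows.*'); // emit each element of rows as it parses

req.pipe(parser);

parser.on('data', function(row) {
    // each matched object arrives here incrementally
});

parser.on('error', function(err) {
    // malformed input should surface here instead of throwing
});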

Hopefully this info helps some.

JP Richardson
  • It seems to me that simply concatenating the string chunks onto the existing string would be faster than pushing to an array, then adding the second step of .join() in the .parse()...No? – Bryson Nov 03 '12 at 09:41
  • 1
    In most high level languages, string concatenation is extremely slow as opposed to String buffer operations, because string concatenation allocates new memory each time the concatenation occurs. In JavaScript, the push to array method was the preferred method, and on the client (browser) it still may be. However, as I said, V8 (Node.js) optimizes this case extremely well. In this case, actually V8 optimizes it even better than the array case. http://stackoverflow.com/questions/7299010/why-is-string-concatenation-faster-than-array-join So, yes you're correct. – JP Richardson Nov 03 '12 at 15:04
  • Only correct by edge case. Thanks very much for this explanation. – Bryson Nov 03 '12 at 20:47