I realize that there are a ton of Node modules that provide an async API for parsing JSON, but many of them seem to read the entire file or stream into memory, construct a giant string, and then pass it to JSON.parse(). This is what the second answer to "How to parse JSON using NodeJS?" suggests, and is exactly what the jsonfile module does.

Constructing a giant string is exactly what I want to avoid. I want an API like:

parseJsonFile(pathToJsonFile): Promise

where the Promise that is returned resolves to the parsed JSON object. This implementation should use a constant amount of memory. I'm not interested in any sort of SAX-like thing that broadcasts events as various pieces are parsed: just the end result.
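
For illustration, calling this hypothetical function would look something like:

// Hypothetical usage: parseJsonFile is the function described above,
// not an existing module. The promise resolves to the parsed value.
parseJsonFile('/path/to/big.json')
  .then(data => {
    // use the parsed object
  })
  .catch(error => {
    // I/O or JSON syntax error
  });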

I think jsonparse may do what I want (it clearly includes logic for parsing JSON without using JSON.parse()), but there is no simple example in the README.md, and the one file in the examples directory seems overly complicated.

bolinfest

2 Answers

I've written a module that does this: BFJ (Big-Friendly JSON). It exports a bunch of functions that operate at different levels of abstraction, but are all asynchronous and streaming at their core.

At the highest level are two functions for reading from and writing to the file system, bfj.read and bfj.write. They each return a promise, so you call them like this:

var bfj = require('bfj');

// Asynchronously read from a JSON file on disk
bfj.read(path)
  .then(data => {
    // :)
  })
  .catch(error => {
    // :(
  });

// Asynchronously write to a JSON file on disk
bfj.write(path, data)
  .then(data => {
    // :)
  })
  .catch(error => {
    // :(
  });

Also at this level is a function for serializing data to a JSON string, called bfj.stringify:

// Asynchronously serialize data to a JSON string
bfj.stringify(data)
  .then(json => {
    // :)
  })
  .catch(error => {
    // :(
  });

Beneath those are two more generic functions for reading from and writing to streams, bfj.parse and bfj.streamify. These serve as the foundations for the higher-level functions, but you can also call them directly:

// Asynchronously parse JSON from a readable stream
bfj.parse(readableStream)
  .then(data => {
    // :)
  })
  .catch(error => {
    // :(
  });

// Asynchronously serialize data to a writable stream of JSON
bfj.streamify(data)
  .pipe(writableStream);

At the lowest level there are two functions analogous to SAX parsers/serializers, bfj.walk and bfj.eventify. It's unlikely you'd want to call these directly; they're just the guts of the implementation for the higher levels.
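
Just to give a flavour, walking a stream looks roughly like this (a sketch only; the readme documents the full set of events):

// Rough sketch of the SAX-like interface. `readableStream` is any
// readable stream of JSON; the event names live on `bfj.events`.
var emitter = bfj.walk(readableStream);

emitter.on(bfj.events.array, () => { /* array opened */ });
emitter.on(bfj.events.property, name => { /* property name parsed */ });
emitter.on(bfj.events.string, value => { /* string value parsed */ });
emitter.on(bfj.events.endArray, () => { /* array closed */ });
emitter.on(bfj.events.error, error => { /* :( */ });
emitter.on(bfj.events.end, () => { /* end of input */ });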

It's open-source and MIT-licensed. For more information, check the readme.

Phil Booth
  • Phil Booth: the API of BFJ looks great! One seemingly minor thing that I question is the value of `check-types`. BFJ makes only 13 calls into the module, half of which seem like they could be replaced with a `typeof` check. I'm not sure it's worth the extra 20K. Also, it appears that the published version of `check-types` comes with `src/check-types.min.js`, which would save some load time, but the `"main"` in the `package.json` still points to `src/check-types.js`, so you end up loading the bigger version, anyway. – bolinfest Dec 09 '16 at 19:08

jsonparse is a streaming JSON parser, and its sample code already shows the minimum needed to use it with a Node stream:

  1. Change client.request() to fs.createReadStream().
  2. Set up on('data') listeners on the file read stream, similar to what's in on('response') in the example (see the sketch below).
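
Putting those two steps together, a minimal sketch of the parseJsonFile function from the question might look like this (untested; it follows the onValue/write API shown in jsonparse's example):

var fs = require('fs');
var Parser = require('jsonparse');

function parseJsonFile(pathToJsonFile) {
  return new Promise(function (resolve, reject) {
    var parser = new Parser();
    // jsonparse invokes onValue for every value it completes; the
    // top-level value is the one emitted when the parser's stack is empty.
    parser.onValue = function (value) {
      if (this.stack.length === 0) {
        resolve(value);
      }
    };
    parser.onError = reject;
    fs.createReadStream(pathToJsonFile)
      .on('data', function (chunk) { parser.write(chunk); })
      .on('error', reject);
  });
}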
leesei
  • The lack of documentation for [jsonparse](https://github.com/creationix/jsonparse) leaves me unsettled. There is only one file in `examples` and it doesn't even look like it will work because there are things like `require('./colors')` yet no `colors.js` file. – bolinfest Dec 09 '16 at 18:53