So I have a HUGE external JSON file that I want to read in my Node.js project. I want to save the value of one specific key from that file to another external JSON file; the value itself is one HUGE array.

The general structure of the input JSON:

{
    key1: val1,
    key2: val2,
    key3: [val3_1, val3_2, ...],
    key4: {
        key4_1: val4_1,
        key4_2: [val4_2_1, val4_2_2, ...]
    },
    ...
}

I am not sure whether reading line by line, which is what almost everything I have read suggests for HUGE JSON files, is the way to proceed, since what I want is effectively a search operation.
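
To make the "search" concrete: on a file small enough to parse in one go, it would just be a key lookup after JSON.parse. The sketch below uses a made-up sample that mirrors the structure above; `sample` and `getByPath` are hypothetical names for illustration only, and this approach is exactly what stops working once the file no longer fits in memory.

```javascript
// Hypothetical sample mirroring the structure above; the real file is far larger.
const sample = JSON.stringify({
  key1: 'val1',
  key2: 'val2',
  key3: ['val3_1', 'val3_2'],
  key4: { key4_1: 'val4_1', key4_2: ['val4_2_1', 'val4_2_2'] },
});

// Walk a list of keys through a parsed object.
// Returns undefined as soon as a step is missing or is not an object.
function getByPath(obj, path) {
  return path.reduce(
    (node, k) => (node !== null && typeof node === 'object' ? node[k] : undefined),
    obj
  );
}

// Parsing the whole document at once only works for small inputs.
const parsed = JSON.parse(sample);
const wanted = getByPath(parsed, ['key4', 'key4_2']);
console.log(wanted);
```

The streaming version has to reach the same value without ever holding `parsed` in memory as a whole.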

INDRAJITH EKANAYAKE
Zia
  • You said (in a comment on my deleted answer) that you've tried to use a streaming JSON parser to process this and gotten the error "Error: Top-level object should be an array." I suggest showing that code, because I'm fairly sure it's just that you used the library incorrectly. Obviously lots of JSON files have a top-level object rather than a top-level array. – T.J. Crowder May 12 '19 at 10:37

1 Answer

So, thanks to @T.J.Crowder, I managed to locate a wrong method call I was making. I have working code now:

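const StreamObject = require('stream-json/streamers/StreamObject');
const fs = require('fs');

// Input (geojson) and output (json) file paths.
const inputfile = "~Path/5cd792a633e32a6e5e20e56a.geojson";
const outputfile = "~Path/5cd792a633e32a6e5e20e56a.json";

// StreamObject emits one {key, value} pair per top-level property of the input object.
const jsonStream = StreamObject.withParser();
const outstream = fs.createWriteStream(outputfile);

jsonStream.on('data', ({key, value}) => {
    // Only handle top-level values that are objects with their own 'features' property.
    // The null/typeof guard avoids a TypeError on null or primitive values.
    if (value !== null && typeof value === 'object' &&
            Object.prototype.hasOwnProperty.call(value, 'features')) {
        outstream.write(JSON.stringify(Object.values(value['features'])));
    }
});

// Close the output file once the input is fully parsed, then report completion.
jsonStream.on('end', () => {
    outstream.end(() => console.log('Done Export!'));
});

fs.createReadStream(inputfile).pipe(jsonStream.input);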

Basically, all I am doing is reading a geojson file that is stored locally and exporting the array at data['data']['features'] to another external json file. This is just a sample; the geojson and the exported array/json might get pretty big.

NOW, although I managed to achieve this by merging different Stack Overflow posts, I am not sure whether it does the job properly, i.e. whether the whole array is kept from being stored in RAM at once. I am especially unsure about the way the if statement is used to write the output. Please correct the code if necessary.
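
On the output side, one possible way to avoid serializing the whole array in a single JSON.stringify call is to write it element by element and wait for 'drain' whenever the write buffer fills up. This is only a sketch using Node core modules; `writeJsonArray`, `fakeFeatures` and `collectingSink` are hypothetical names, and the generator stands in for whatever source yields the features one at a time. It does not change how the input is parsed, so it only helps if the features actually arrive one by one.

```javascript
const { Writable } = require('stream');
const { once } = require('events');

// Writes the items of any sync or async iterable as one JSON array.
// Only the current element is serialized; the full array text is never built.
async function writeJsonArray(items, out) {
  out.write('[');
  let first = true;
  for await (const item of items) {
    const chunk = (first ? '' : ',') + JSON.stringify(item);
    first = false;
    // write() returns false when the internal buffer is full: wait for 'drain'.
    if (!out.write(chunk)) await once(out, 'drain');
  }
  // Register the 'finish' listener before ending so the event cannot be missed.
  const finished = once(out, 'finish');
  out.end(']');
  await finished;
}

// Hypothetical stand-in for a source that yields features one at a time.
function* fakeFeatures() {
  for (let i = 0; i < 3; i++) yield { type: 'Feature', id: i };
}

// Hypothetical in-memory sink used instead of fs.createWriteStream for the demo.
function collectingSink() {
  const chunks = [];
  const sink = new Writable({
    write(chunk, _encoding, callback) {
      chunks.push(chunk.toString());
      callback();
    },
  });
  sink.text = () => chunks.join('');
  return sink;
}

const demoSink = collectingSink();
writeJsonArray(fakeFeatures(), demoSink).then(() => console.log(demoSink.text()));
```

In real use the sink would be the `fs.createWriteStream(outputfile)` stream from the code above.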

Thanks!

Zia