
The structure of the JSON file is as follows:

{
    "products": [
    {
        "id": 6672129786814,
        "title": "rhPQbUW2fK5bhVCFuNFPsBGdolZYYcJ9gp4D4gskBHCOmGWb54",
        "variants":[{...},{...}],
        ....
    },{
        "id": 6672129786824,
        "title": "yuhPQbUW2fK5bhVCFfgdfgdfsglZYYcJ9gp4D4gskBHCOmGWb54",
        "variants":[{...},{...}],
        ....
    },{
        "id": 6672129786842,
        "title": "dfgUW2fK5bhVCFuNfgdsfgolZYYcJ9gp4D4gskBHCOmGWb54",
        "variants":[{...},{...}],
        ....
    },{
        "id": 6672129786935,
        "title": "aayuy44fK5bhVCFuNFPsBGdolZYYcJ9gp4D4gskBHCOmGWb54",
        "variants":[{...},{...}],
        ....
    }]
}

This JSON file can contain up to 500,000 objects. I need to search for a product based on its product_id. I know that we can read the file using streaming, and that works fine: it gives me all the objects in the JSON file as the result. But now I need to find one specific product by its product_id. I know that once I have all the products I could loop over them and pick out the matching one, but I don't think that is an efficient way to search.

I am looking for a way to find the specific object by its id value while the file is being read, so that I can get that one object quickly, rather than loading all the objects first and then iterating over them to match the id.

const fs = require('fs')

// (This snippet runs inside a Promise executor, so `resolve` and `reject` are in scope.)
var data = ''
var reader_stream = fs.createReadStream(file_path) // Create a readable stream
reader_stream.setEncoding('UTF8')

// Accumulate every chunk until the whole file has been read into memory
reader_stream.on('data', function(chunk) {
  data += chunk
})

// Parse the complete contents once the stream ends
reader_stream.on('end', function() {
  const products = JSON.parse(data)
  resolve(products)
})

reader_stream.on('error', function(err) {
  console.log(err.stack)
  reject(err.stack)
})
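
The lookup I would do after parsing is then just a linear scan over the array, along these lines (product_id is only an example variable here):

// Hypothetical lookup once everything is parsed: it scans up to 500,000
// objects held in memory just to return a single one.
const product = products.products.find(p => p.id === product_id)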

If I work with the chunks directly, there is no guarantee that a chunk contains a complete object (since it is just a chunk of text). So how should I read the data object by object? Can anyone suggest a way to get that specific object quickly?

Deep Kakkar

1 Answer


If you can trust your file to have exactly the structure shown in your example (an opening {, a first line with the id, the rest of the object's lines, then a closing }), you can use https://stackoverflow.com/a/32599033/2729605 to read it line by line and check for the desired id inside the loop (a regex, for example). Once the id is found, keep reading lines until you hit another id or the end of the file, and discard those extra lines before building the final object. Adapted from there:

// Product-level ids are assumed to sit at the start of their own line,
// as in the question's example.
const regexStart = /^\s*"id"\s*:\s*6672129786842\b/
const regexStop = /^\s*"id"\s*:/

const fs = require('fs');
const readline = require('readline');

async function processLineByLine() {
  const fileStream = fs.createReadStream('input.txt');

  const rl = readline.createInterface({
    input: fileStream,
    crlfDelay: Infinity
  });
  // Note: we use the crlfDelay option to recognize all instances of CR LF
  // ('\r\n') in input.txt as a single line break.

  const collected = [];
  let capturing = false;

  for await (const line of rl) {
    // Each line in input.txt will be successively available here as `line`.
    if (!capturing && regexStart.test(line)) {
      // Found the desired id: start storing this line and the following ones
      capturing = true;
      collected.push(line);
    } else if (capturing) {
      if (regexStop.test(line)) {
        // The next object's id was reached: stop reading, we have our lines
        break;
      }
      collected.push(line);
    }
  }

  rl.close();
  return collected;
}

processLineByLine();
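
Producing the final output then amounts to re-adding the opening { (it sits on the previous line in your example) and trimming the trailing separator before calling JSON.parse. Something along these lines works under that assumption, but the exact trim depends on how your file is formatted:

processLineByLine().then(function(lines) {
  if (lines.length === 0) {
    console.log('product id not found')
    return
  }
  // Re-add the opening brace and turn the trailing "},{"-style separator back
  // into a plain closing brace. (For the last product in the file the trailing
  // lines are "}]" and "}" instead, and need a different trim.)
  const text = '{' + lines.join('\n').replace(/\}\s*,\s*\{?\s*$/, '}')
  const product = JSON.parse(text)
  console.log(product.id, product.title)
})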

If your structure is more variable this is still doable, but you need to buffer the lines you read before each id and discard them once it turns out they were not part of the desired object.
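
A rough sketch of that buffering idea, using the same line-by-line setup as above (findWithBuffer, targetId and anyId are just illustrative names, and the final front/back trimming of the buffer is again left to you):

const fs = require('fs');
const readline = require('readline');

const targetId = /^\s*"id"\s*:\s*6672129786842\b/;
const anyId = /^\s*"id"\s*:/;

async function findWithBuffer(path) {
  const rl = readline.createInterface({
    input: fs.createReadStream(path),
    crlfDelay: Infinity
  });

  let buffer = [];   // lines read since we last discarded
  let found = false;

  for await (const line of rl) {
    if (anyId.test(line)) {
      if (found) break;        // the id after the desired one: stop reading
      if (targetId.test(line)) {
        found = true;          // keep the buffer: it holds the tail of the previous
                               // object plus the opening lines of the desired one
      } else {
        buffer = [];           // everything buffered so far belongs to objects
        continue;              // we don't want, including this id line: discard
      }
    }
    buffer.push(line);
  }

  rl.close();
  // The caller still has to drop the leading lines that belong to the previous
  // object (everything before the desired object's opening "{") and trim the
  // trailing separator before the lines can be parsed.
  return found ? buffer : null;
}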

malarres