
I have a 6 GB JSON file. When I read it with the following code:

var fs = require('fs');
var contents = fs.readFileSync('large_file.txt').toString();

I get the following error:

buffer.js:182
    throw err;
    ^

RangeError: "size" argument must not be larger than 2147483647
    at Function.Buffer.allocUnsafe (buffer.js:209:3)
    at tryCreateBuffer (fs.js:530:21)
    at Object.fs.readFileSync (fs.js:569:14)
    at Object.<anonymous> (/home/readHugeFile.js:4:19)
    at Module._compile (module.js:569:30)
    at Object.Module._extensions..js (module.js:580:10)
    at Module.load (module.js:503:32)
    at tryModuleLoad (module.js:466:12)
    at Function.Module._load (module.js:458:3)
    at Function.Module.runMain (module.js:605:10)

Could somebody help, please?

Alessio Cantarella
  • Possible duplicate of [Node.js read big file with fs.readFileSync](https://stackoverflow.com/questions/29766868/node-js-read-big-file-with-fs-readfilesync) – Kukic Vladimir Jul 09 '17 at 09:20
  • Possible duplicate of [What's the maximum size of a Node.js Buffer](https://stackoverflow.com/questions/8974375/whats-the-maximum-size-of-a-node-js-buffer) – Evan Carroll Feb 24 '19 at 23:27

2 Answers


The maximum size of a Buffer, which is what readFileSync() uses internally to hold the file data, is about 2 GB (source: https://nodejs.org/api/buffer.html#buffer_buffer_kmaxlength).
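
You can check the exact limit on the Node.js version you're running; a minimal sketch, reading the buffer.kMaxLength constant from the docs linked above:

const { kMaxLength } = require('buffer');

// Largest allowed size of a single Buffer, in bytes; on the 64-bit
// Node 8 build from this stack trace it was 2^31 - 1 = 2147483647.
console.log(kMaxLength);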

You probably need a streaming JSON parser, like JSONStream, to process your file:

const JSONStream = require('JSONStream');
const fs         = require('fs');

// Parse the file incrementally: '*' emits each element of the top-level
// array (or each value of the top-level object), so the whole 6 GB file
// never has to fit into a single Buffer.
fs.createReadStream('large_file.json')
  .pipe(JSONStream.parse('*'))
  .on('data', entry => {
    console.log('entry', entry);
  });
robertklep
  • Hi @robertklep, I am writing a CLI app, where I need to parse a big JSON file, and respond to the user. Your code does the job, but it's asynchronous. Is there a recommended way of working with streams synchronously? Thanks – mils Jun 14 '18 at 05:35
  • @mils the main feature that streams provide in this example is being able to parse the file incrementally. In your case, it sounds like you want to read/parse the file in one go, which will require multiple GB of RAM. Are you sure that's what you want? AFAIK, there aren't any synchronous stream implementations (and besides that, streams are inherently event-based). – robertklep Jun 14 '18 at 06:33
  • Hi @robertklep, I'm basically doing multiple passes over the file. The first pass gets a unique set of ids from the JSON file, then for each id I perform another pass, retrieving deeper data. This is how I keep memory use low. But I need to do step 2 after step 1 is completely finished (i.e. `for(id in ids)`), and return results to the user. I guess maybe I'm just a n00b with javascript. Is there a "classic" way of solving the sync/async problem? Thanks again – mils Jun 14 '18 at 07:20
  • 1
  • @mils what might help is the observation that readable streams emit an `end` event when all data has been read. In your case, you can start step 2 once the `end` event of step 1 has fired. That way, you can chain the stream operations (see the sketch after this thread). – robertklep Jun 14 '18 at 07:37
  • 3
  • 2 GB max for a Buffer seems too small by today's standards. Does anyone know how to increase it? Why does it have to be so small? Is this a fundamental limitation of JavaScript (V8) itself? – Soichi Hayashi Sep 12 '18 at 21:12
  • @SoichiHayashi I'm not sure if typed arrays have a default maximum size. – robertklep Sep 13 '18 at 06:38
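
A minimal sketch of the chaining described in that thread, assuming the file is a top-level JSON array whose entries have an id field (both assumptions, not from the question):

const JSONStream = require('JSONStream');
const fs         = require('fs');

const ids = new Set();

// Pass 1: collect the unique ids.
fs.createReadStream('large_file.json')
  .pipe(JSONStream.parse('*'))
  .on('data', entry => ids.add(entry.id))
  .on('end', () => {
    // Pass 2 starts only once pass 1 has read the whole file.
    for (const id of ids) {
      // ...stream the file again here, extracting the data for this id...
    }
  });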

You can read the file with the line-reader Node.js package: every 50,000 lines, write the lines out to a small file sequentially, then process those files and clear them, if your task is to read data from each line of a bigger file. line-reader can do the job because it uses streams under the hood. Note that line-reader doesn't wait for you if you read a line and immediately kick off asynchronous work on it, such as an update in MongoDB. I did it this way and it worked even for a 10 GB file.
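
A rough sketch of that approach, using the line-reader package's documented eachLine() callback; the 50,000-line chunk size comes from the answer, and processChunk() is a hypothetical placeholder for whatever you do with each batch:

const lineReader = require('line-reader');

let chunk = [];

lineReader.eachLine('large_file.json', (line, last, cb) => {
  chunk.push(line);
  if (chunk.length === 50000 || last) {
    // Call cb() only once the chunk is handled, so line-reader waits
    // instead of racing ahead of async work such as MongoDB updates.
    processChunk(chunk, () => {
      chunk = [];
      cb();
    });
  } else {
    cb();
  }
});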
