
I am really new to JS, and even newer to Node.js. Using "traditional" programming paradigms, my file looks like this:

var d = require('babyparse');
var fs = require('fs');

var file = fs.readFile('SkuDetail.txt');

d.parse(file);

So this has many problems:

  1. It's not asynchronous
  2. My file is bigger than the default max file size (this one is about 60 MB), so it currently breaks (though I'm not 100% sure that's the reason).

My question: how do I load a big file (and future files will be significantly bigger than 60 MB) asynchronously, parsing as the data arrives? As a follow-up, how do I know when everything is complete?

Reenen

3 Answers


You should create a ReadStream. A common pattern looks like this; you can parse the data as it becomes available on the data event.

var fs = require('fs');

function readFile(filePath, done) {
    var stream = fs.createReadStream(filePath);
    var out = '';

    // Make done optional
    done = done || function(err) { if (err) throw err; };

    stream.on('data', function(data) {
        // Parse data here as it arrives; this example simply buffers it
        out += data;
    });

    stream.on('end', function() {
        done(null, out); // All data has been read
    });

    stream.on('error', function(err) {
        done(err);
    });
}

You can use the method like this:

readFile('SkuDetail.txt', function(err, out) {
    // Handle error
    if (err) throw err;

    // File has been read and parsed
});

If you add the parsed data to the out variable, the entire parsed file will be sent to the done callback.

pstenstrm
  • True, as you are dealing with large files, it is better to use streams. – Risto Novik Jun 11 '15 at 07:49
  • I'm assuming done is the callback function here. Apologies for my n00bness, but how would that function look? function done (err, data) { //.. }? --- this throws an error on "done(null,out);" - "undefined is not a function" – Reenen Jun 11 '15 at 08:25
  • Yes, `done` is the callback. I'll update with some usage examples. – pstenstrm Jun 11 '15 at 11:52

For the first question: since you want to process chunks, streams are probably what you are looking for. @pstenstrm has an example in his answer.

Also, you can check this Node.js documentation link for Streams: https://nodejs.org/api/fs.html#fs_fs_createreadstream_path_options

If you want a brief description and example of streams, check this link: http://www.sitepoint.com/basics-node-js-streams/

You can pass a callback to the fs.readFile function to process the content once the file read is complete. This would answer your second question.

fs.readFile('SkuDetail.txt', function(err, data){
    if(err){
         throw err;
    }
    processFile(data);
});

You can see "Get data from fs.readFile" for more details.

Also, you could use Promises for cleaner code with other added benefits. Check this link: http://promise-nuggets.github.io/articles/03-power-of-then-sync-processing.html

Antariksha

It already is asynchronous: Node's I/O APIs are asynchronous by default, so no extra effort is needed on your part. Does your code even work, though? Your parse call should be inside the callback of readFile; otherwise readFile returns before the file has been read, and file is never set.

In normal situations any I/O call you write will be "skipped over", and the synchronous code after it will be executed first.

Joe Yahchouchi
  • Yeah that first set of code doesn't work, and it seems because it doesn't successfully load the file. (But I'm still debugging it). – Reenen Jun 11 '15 at 08:18
  • Even if it loads the file, you have to put d.parse inside a callback of var file = fs.readFile('SkuDetail.txt'); because otherwise javascript will not be setting the value of file to anything. – Joe Yahchouchi Jun 11 '15 at 08:47