I have a tab-separated file with several columns (usually 9). The file can be several hundred megabytes in size, typically just under 1 GB, which is thousands to millions of lines. A group of lines (the number varies) describes one particular thing. Each line holds some bits of information, and I want to collate the lines for each thing into a single object, because that will be much easier to work with. Here is my initial attempt:
const fs = require('fs');
const events = require("events");
const readline = require('readline');
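// Path to the input file: the first argument after the script name
const myFile = process.argv[2];
const rl = readline.createInterface({
  input: fs.createReadStream(myFile)
});
// Maps the value in column 0 to the value in column 3
const myObject = {};
rl.on('line', (line) => {
  const items = line.split("\t");
  // Only the first line seen for each key is stored
  if (!(items[0] in myObject)) {
    myObject[items[0]] = items[3];
  }
});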
I have learned how to read in a large file, and I'm starting to understand Node.js events, but my problem is that a varying number of lines belong together, and they don't have to be sequential lines in the file. So this seems to need some sort of look-ahead, but the look-ahead might have to scan the whole file, which I believe isn't efficient.
After reading a post about Node.js events that was very similar to my problem, I came up with this:
const fs = require('fs');
const events = require('events');
const readline = require('readline');
const myFile = process.argv[2];
const myEvent = new events.EventEmitter();
const rl = readline.createInterface({
  input: fs.createReadStream(myFile)
});
rl.on('line', function(line) {
  const items = line.split("\t");
  // Object literals use `key: value`, not `key = value`
  const myObject = {
    id: items[0],
    name: items[2],
    other: items[7]
  };
  myEvent.emit('data', myObject);
});
myEvent.on('data', function(myObject) {
  console.log(myObject);
});
I think I'm beginning to understand how rl, an instance of the readline class, has events, and how with .on and the line event you can get every line from a file. You can then emit that newly made object for further processing. But I can't figure out how to combine several lines, i.e. how to store everything in a single global object.
P.S. I'm a newbie at Node.js and JS in general, but really keen to take it up. Any general advice, links, or other help would be much appreciated.