
I've been struggling for six days to find a solution that gives me buffered output of a given large file. The file can be text, binary, etc.

The code below gives buffered output only for small files; feeding it a large file makes it crash.

var fs = require('fs')

var stream = fs.createReadStream('big-file')

stream.on('data', function (data) {
    console.log(data)
})

I also tried using the 'byline' library:

var fs = require('fs')
var byline = require('byline')

var stream = fs.createReadStream('big-file')

var buffered_stream = byline.createStream(stream, { keepEmptyLines: true })

buffered_stream.on('data', function (data) {
    console.log(data)
})

It gives the buffered output of each line, but the bytes get corrupted because extended (non-ASCII) characters are multi-byte and can be split across chunks.
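
To illustrate what I mean, here is a contrived example (assuming UTF-8 text, just for the sake of illustration) of how splitting a buffer in the middle of a multi-byte character corrupts it:

var charBuf = Buffer.from('café', 'utf8') // 5 bytes: 'é' is 0xC3 0xA9
var first = charBuf.slice(0, 4)           // chunk boundary falls inside 'é'
var second = charBuf.slice(4)

console.log(first.toString())             // 'caf�' – the trailing byte is incomplete
console.log(second.toString())            // '�'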

If someone can, please help me. Thanks in advance.

z3ron3
  • Sorry for the awful formatting, but can you try something like this and see if there's something wrong with the file? `/* This will wait until we know the readable stream is actually valid before piping */ readStream.on('open', function () { /* This just pipes the read stream to the response object (which goes to the client) */ readStream.pipe(res); }); /* This catches any errors that happen while creating the readable stream (usually invalid names) */ readStream.on('error', function(err) { res.end(err); });` – hcs Jul 31 '18 at 16:13
  • Possible duplicate of https://stackoverflow.com/questions/6156501/read-a-file-one-line-at-a-time-in-node-js?rq=1 – HugoTeixeira Jul 31 '18 at 21:41

3 Answers


As you can see here, the stream's highWaterMark option (16 KiB by default) only controls how large each emitted chunk is; it is not a maximum file size. You don't need to change it, just concatenate the buffers as they arrive:

const fs = require('fs');

const stream = fs.createReadStream(BIG_FILE);

const buffers = [];
stream.on('data', (chunk) => {
    // each chunk is already a Buffer; collect them as they arrive
    buffers.push(chunk);
});

stream.on('end', () => {
    // join everything into one Buffer once the stream has finished
    console.log(Buffer.concat(buffers));
});
bombyx mori
  • bombyx mori, I tried using your code but it has the same problem as mine: it hangs on big files. I tried it on a 700 MB exe. This happens because the array you accumulate keeps growing as the read stream delivers data. Anyway, I got the answer; it's posted below. Thanks for your consideration. – z3ron3 Aug 05 '18 at 06:50

Funny, but I got the answer a minute after I posted this question. Sorry for the delay in updating. The code below outputs a buffer for each line of the specified file (big or small).

var fs = require('fs')
var lazy = require('lazy')

var readStream = fs.createReadStream('FILE')

new lazy(readStream)
    .lines
    .forEach(function(line) {
        console.log(line) // Buffer
    })
z3ron3

You can use the built-in readline module, as suggested here.
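
A minimal sketch of that approach (assuming the file is line-oriented text; 'big-file' is a placeholder path):

const fs = require('fs');
const readline = require('readline');

const rl = readline.createInterface({
  input: fs.createReadStream('big-file'),
  crlfDelay: Infinity // treat \r\n as a single line break
});

rl.on('line', (line) => {
  console.log(line); // each complete line, decoded as a string
});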

In addition, Node.js 10 added support for asynchronously iterating over readable streams. However, readline does not yet have support for async iterators (ref).

Example of async iteration from the Node docs:

const fs = require('fs');

async function print(readable) {
  readable.setEncoding('utf8');
  let data = '';
  for await (const k of readable) {
    data += k;
  }
  console.log(data);
}

print(fs.createReadStream('file')).catch(console.error);
Idan Dagan