0

I have an ASCII file about 1GB, my task here is to write a fixed text every 1000 lines.

e.g.

Expected Output

text line 1
text line 2
text line 3
...
text line 999
text line 1000
// My fixed text
text line 1001
text line 1002

Attemp 1 - JavaScript heap out of memory

Trying to write one byte at a time, and keep track number of newlines.

Write my fixed text every 1000 lines

let src = `/path/to/src`;
let dst = `/path/to/dst`;

let readStream = fs.createReadStream(src);
let writeStream = fs.createWriteStream(dst);
let newline = 0;

readStream.on('data', chunk => {
    for(let i in chunk) {
        let c = chunk[i];

        // write one byte at a time 
        writeStream.write(Buffer.from(chunk.buffer, i, 1))

        if(c == 10) {
            newline++;

            // write fixed text for every 1000th line
            if(newline && newline % 1000 == 0) {
                writeStream.write("My Fixed Text");
            }
        }
    }
}).on('end', () => {
    console.log('done');
});

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory

Attemp 2 - Output file scrambled and out of order

Same as attemp 1, but writing one byte at a time from my pre-allocated buffer.

let src = `/path/to/src`;
let dst = `/path/to/dst`;

let readStream = fs.createReadStream(src);
let writeStream = fs.createWriteStream(dst);
let newline = 0;
let buffer = Buffer.alloc(1);

readStream.on('data', chunk => {
    for(let i in chunk) {
        let c = chunk[i];

        buffer[0] = c;

        // write one byte at a time from my pre-allocated buffer
        writeStream.write(buffer)

        if(c == 10) {
            newline++;

            // write fixed text for every 1000th line
            if(newline && newline % 1000 == 0) {
                writeStream.write("My Fixed Text");
            }
        }
    }
}).on('end', () => {
    console.log('done');
});

No error, the script took a while to exit after done is printed

However, the output file looks completely scrambled and out of order

L,a'-etLL'7P62 ,,NL,n''he'L A,r'0et,L ,ei2'vrUb32oa0on' L,c27CirU 7' 2- tN Ng7Bha'0atLV7al'0e'pUbLe2,3 goLA0n'0,v',L9lc'1C,LI0Et0-m',UT5,nD1e''NU,On91arNE40r20ilHLVEn''''''''mwRL (3(0-r''NUB o00at S7Eo'7'clUR32 20vowL 64v80vaL 0,u00hNNUE4n,1'aa L 2 AC1p,,NU e2'Cit, 22y F8etLL56Ola-1C'NLS3n '3mrUS34nn1-CL,g5tw7ma'L)'4t60h'L

This task sounds easy and straightforward, but I'm having trouble to do this.

Any help is appreciated.

Neverever
  • 15,890
  • 3
  • 32
  • 50
  • searching suggests the write stream is not getting flushed before closing the file, causing the js engine to run out of heap space. https://grokbase.com/t/gg/nodejs/125e84345w/how-to-flush-a-writestream-before-the-program-is-done-executing (offsite), https://stackoverflow.com/q/48410361/5217142 , https://stackoverflow.com/questions/43643467/how-to-write-incrementally-to-a-text-file-and-flush-output (onsite) and closed issue on github https://github.com/nodejs/node-v0.x-archive/issues/7348 where transform streams are mentioned may provide leads. – traktor Oct 26 '18 at 12:56
  • Ah, transform stream is a good suggestion, thanks for the direction – Neverever Oct 26 '18 at 23:40

0 Answers0