I'm trying to batch-process a file: read it line by line and post the records to a database. Currently I'm batching 20 records at a time, as shown below.
Despite the documentBatch.length check I've put in, it still isn't working: the database call inside persistToDB should run 5 times, but for some reason it's only called once, and console-logging documentBatch.length shows it climbing well past the 20-record limit. I suspect this is a concurrency issue; persistToDB comes from an external library and has to be awaited inside an async function.
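For context, persistToDB resolves with per-batch counts. A rough stand-in (my simplification, not the library's actual code) would be:

async function persistToDB(documents) {
  // Writes the batch to the database and resolves with counts;
  // simplified stand-in for the external library function.
  return {
    succesfullyProcessed: documents.length,
    unsuccesfullyProcessed: 0,
  };
}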
The way I'm batching is to pause the stream while the database work runs and resume it once that work is done, but that still shows the same issue.
const fs = require('fs');
const readline = require('readline');

// Assumed setup, omitted from my original snippet; the filename is a
// placeholder and logger here is a stand-in for my actual logger.
const logger = console;
const rl = readline.createInterface({
  input: fs.createReadStream('documents.jsonl'),
  crlfDelay: Infinity,
});

let documentBatch = [];
const processedMetrics = {
  succesfullyProcessed: 0,
  unsuccesfullyProcessed: 0,
};

rl.on('line', async (line) => {
  try {
    const document = JSON.parse(line);
    documentBatch.push(document);
    console.log(documentBatch.length); // regularly logs values above 20
    if (documentBatch.length === 20) {
      // Pause the stream while the batch is persisted, then resume.
      rl.pause();
      const batchMetrics = await persistToDB(documentBatch);
      documentBatch = [];
      processedMetrics.succesfullyProcessed +=
        batchMetrics.succesfullyProcessed;
      processedMetrics.unsuccesfullyProcessed +=
        batchMetrics.unsuccesfullyProcessed;
      rl.resume();
    }
  } catch (e) {
    logger.error(`Failed to save document ${line}`);
    throw e;
  }
});
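To illustrate what I suspect is happening, here's a minimal sketch using a plain EventEmitter as a stand-in for readline. The emitter doesn't wait for an async listener to finish, so while one invocation is suspended at the await, further events keep firing and the invocations interleave:

const { EventEmitter } = require('events');

const emitter = new EventEmitter();
emitter.on('tick', async (i) => {
  console.log('start', i);
  // Stand-in for the awaited persistToDB call.
  await new Promise((resolve) => setTimeout(resolve, 10));
  console.log('end', i);
});
for (let i = 0; i < 3; i++) emitter.emit('tick', i);
// Prints: start 0, start 1, start 2, then end 0, end 1, end 2

If that's what's going on in my snippet too, it would explain why documentBatch keeps growing past 20 while a batch is being persisted.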