I downloaded a large CSV file of food nutrient values for an API I want to build. The file is over 300 MB and has more than 6 million lines. The structure of the CSV file is as follows:
"id","fdc_id","nutrient_id","amount","data_points","derivation_id","min","max","median","footnote","min_year_acquired"
"3639112","344604","1089","0","","75","","","","",""
"3639113","344604","1110","0","","75","","","","",""
... 6M more of these
This is what I've tried, without success. If I cap the data at a reasonable length and break early, it works (see the capped variant after the code below), but of course I need to parse the entire file into JSON. I also tried piping the read stream through the csv-parser NPM package, roughly as sketched next, but that was unsuccessful as well.
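For reference, the csv-parser attempt looked roughly like this (a reconstructed sketch, not the exact code; csv-parser handles the quoted fields, but this version still buffers every row in a single array):

const fs = require('fs');
const csv = require('csv-parser');

const data = [];
fs.createReadStream('food_nutrient.csv')
  .pipe(csv()) // csv-parser emits one object per CSV row
  .on('data', (row) => data.push(row)) // still accumulates all ~6M rows in memory
  .on('end', () => console.log('Done!', data.length));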
How should I go about this?
const fs = require('fs');
const readline = require('readline');

(async () => {
  const readStream = fs.createReadStream('food_nutrient.csv');
  const rl = readline.createInterface({
    input: readStream,
    crlfDelay: Infinity
  });

  let data = [], titles;
  for await (const line of rl) {
    // naive split on commas; the replace also strips the surrounding quotes
    // (along with every other non-word character)
    const row = line.split(',').map(s => s.replace(/\W/g, '').trim());
    if (!titles) {
      titles = row;
      continue;
    }
    // every row becomes an object that is kept in memory for the whole run
    data.push(Object.fromEntries(titles.map((t, i) => [t, +row[i] || 0])));
  }

  // never getting here in this lifetime
  debugger;
  console.log('Done!');
})();
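And the capped variant mentioned above, which does finish, is just the same loop with an early exit (LIMIT is an arbitrary placeholder for the "reasonable number"):

// identical to the loop above, plus a cap
const LIMIT = 10000; // hypothetical cap; any reasonable number works
for await (const line of rl) {
  const row = line.split(',').map(s => s.replace(/\W/g, '').trim());
  if (!titles) {
    titles = row;
    continue;
  }
  data.push(Object.fromEntries(titles.map((t, i) => [t, +row[i] || 0])));
  if (data.length >= LIMIT) break; // stop early; this version completes quickly
}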