
I downloaded a large CSV file of food nutrient values for an API I want to build. The file is over 300 MB and has more than 6 million lines. The structure of the CSV file is as follows:

"id","fdc_id","nutrient_id","amount","data_points","derivation_id","min","max","median","footnote","min_year_acquired"
"3639112","344604","1089","0","","75","","","","",""
"3639113","344604","1110","0","","75","","","","",""
... 6M more of these

This is what I've tried, without success. If I limit the data to a reasonable length and break early, it works, but of course I need to parse the entire file into JSON. I also tried the csv-parser NPM package and piping the read stream into it, with no luck either. How should I go about this?

const fs = require('fs');
const readline = require('readline');

(async () => {
    const readStream = fs.createReadStream('food_nutrient.csv');
    const rl = readline.createInterface({
        input: readStream,
        crlfDelay: Infinity // treat \r\n as a single line break
    });

    let data = [], titles;

    for await (const line of rl) {
        // Strip only the surrounding quotes (\W also eats decimal points and minus signs).
        const row = line.split(',').map(s => s.replace(/^"|"$/g, '').trim());
        if (!titles) {
            titles = row; // first line holds the column names
            continue;
        }
        // Accumulating ~6M objects in this array is presumably what kills the process.
        data.push(Object.fromEntries(titles.map((t, i) => [t, +row[i] || 0])));
    }

    // never getting here in this lifetime
    debugger;
    console.log('Done!');

})();
spider monkey
  • Using a node module should solve your problem. Try `https://github.com/Keyang/node-csvtojson` to get the CSV as JSON – SpiritOfDragon May 02 '20 at 14:46
  • @SpiritOfDragon How long should this operation take for a file of this size, with that many lines? I tried using this package, but it has seemed stuck for a few minutes now. I also tried a smaller 5 MB CSV file of the same format and it took about 1 second – spider monkey May 02 '20 at 15:01
  • I hope this link helps: https://stackoverflow.com/questions/16617532/large-csv-to-json-object-in-node-js – SpiritOfDragon May 02 '20 at 15:07

0 Answers