I'm trying to use the 'csv-parser' module in a Node.js environment. I successfully read my CSV document and get an array 'results' with all the information I need as JSON-style objects. The first console.log prints all the data, but the second one prints an empty array (as declared). Why am I having this scope problem and how can I fix it? Thanks to all of you in advance.

const csv = require('csv-parser');
const fs = require('fs');
let results = [];
fs.createReadStream('MyData.csv')
    .pipe(csv())
    .on('data', data => results.push(data))
    .on('end', () => {
        console.log(results) //1st console.log
        console.log('CSV file successfully processed');
    });


console.log(results);//2nd console.log
Heretic Monkey
alfer0801
  • @VLAZ - That's not a 100% match here because that question is about returning an asynchronous value from a function, which isn't exactly what's going on here. Still useful to read for further understanding. – jfriend00 Nov 17 '19 at 01:01

1 Answer

It's not a scope problem, it's a timing problem. Your second console.log(results) runs BEFORE any data has been pushed into results.

Your stream and csv() module are non-blocking and asynchronous. That means that you start them and they work at their own speed somewhat in the background, firing events every so often to do more work. Meanwhile, the rest of your code continues to run. That means your last

console.log(results)

runs LONG before the stream and .pipe(csv()) are done, thus it's empty.

To work with asynchronous operations in Node.js, you have to USE the result INSIDE the callback that signals the operation has completed. So you have to use the result right where your first console.log(results) is.

const csv = require('csv-parser');
const fs = require('fs');
let results = [];
fs.createReadStream('MyData.csv')
    .pipe(csv())
    .on('data', data => results.push(data))
    .on('end', () => {
        console.log(results) //1st console.log
        console.log('CSV file successfully processed');
        // use results HERE
    });

  // can't use results here
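
If you'd rather not nest everything inside the 'end' callback, one option is to wrap the stream in a promise and await it. This is only a sketch: collect() is a hypothetical helper name, not part of csv-parser, and Readable.from() with two made-up rows stands in for fs.createReadStream('MyData.csv').pipe(csv()) so the snippet runs without a CSV file.

```javascript
const { Readable } = require('stream');

// Hypothetical helper (not part of csv-parser): gathers every 'data' item
// and resolves with the full array once the stream emits 'end'.
function collect(stream) {
    return new Promise((resolve, reject) => {
        const rows = [];
        stream
            .on('data', row => rows.push(row))
            .on('end', () => resolve(rows))
            .on('error', reject);
    });
}

async function main() {
    // Stand-in for fs.createReadStream('MyData.csv').pipe(csv());
    // the two rows are made-up sample data.
    const stream = Readable.from([{ name: 'a' }, { name: 'b' }]);

    // Execution of main() pauses here until 'end' has fired.
    const results = await collect(stream);
    console.log(results.length); // results is populated at this point
    return results;
}

const done = main();
```

The timing rule is unchanged: any code that needs results still has to run after the await (or inside a .then()), never in the synchronous code that follows the call to main().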

Here's a simpler example that you can run yourself:

console.log("1");

setTimeout(function() {
    console.log("2");
}, 100);

console.log("3");

This outputs

1
3
2

That's because setTimeout() is also non-blocking and asynchronous. Calling setTimeout() just starts the timer, and then the rest of your code keeps running. That's why 3 is output before 2.
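
The ordering doesn't depend on the 100 ms delay. As a small sketch (the order array and the finished promise are names made up for this illustration), even a 0 ms timer callback waits until the currently running synchronous code has finished:

```javascript
const order = [];

order.push('1');

// A 0 ms delay still defers the callback until the current
// synchronous code has finished running.
const finished = new Promise(resolve => {
    setTimeout(() => {
        order.push('2');
        resolve(order);
    }, 0);
});

order.push('3');

finished.then(result => console.log(result.join(' '))); // prints "1 3 2"
```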

jfriend00