
I'm using the node package 'csv', and I've run into an issue in the most obvious use case for the package and am not sure how to proceed. Package repo: https://github.com/wdavidw/node-csv

I need to: 1) read in a csv file and perform a single operation for each line; 2) after the entire csv file has been read, perform an action and write the result to a new csv file.

I'm stuck on part one. Here is what I have after concatenating a bunch of (seemingly inconsistent) examples together.

var fs = require('fs');
var csv = require('csv');
var transform = require('stream-transform');

var outputMap = {};

var baseStream = fs.createReadStream(__dirname + '/locationTaxonomy.csv');

baseStream
  .pipe(csv.parse())
  .pipe(csv.transform(function (record) {
    outputMap[record[2]] = record;
    return record;
  }));

The preceding only gets through the first ~16 lines of the csv file, and then halts. If I pipe baseStream directly to process.stdout, the file is read in its entirety. Any ideas as to how to accomplish this seemingly trivial task?

Derek
  • What's the error when it halts? – brandonscript Jul 22 '15 at 22:27
  • Does it always stop at the same spot? If so, could there be a formatting issue in the csv data? I've never used `csv` before, but I have used [`fast-csv`](https://github.com/C2FO/fast-csv) many times on lots of data in production. You might try that module and see if it gets you any further. That may help you narrow down the problem. – mscdex Jul 22 '15 at 22:59

1 Answer


The return statement in the transform stream handler caused the stream to halt. Removing it allowed the complete csv file to be read. Returning a value pushes each record downstream, and since nothing consumes the transform's output, the pipeline stalls once the internal buffer fills — the default object-mode high-water mark is 16, which matches the ~16 lines observed.

var fs = require('fs');
var csv = require('csv');
var transform = require('stream-transform');

var outputMap = {};

var baseStream = fs.createReadStream(__dirname + '/locationTaxonomy.csv');

baseStream
  .pipe(csv.parse())
  .pipe(csv.transform(function (record) {
    outputMap[record[2]] = record;
    // return record;  // remove this line
  }));
Derek
  • 722
  • 1
  • 6
  • 16
  • Had the exact same problem -- parsing stopped after exactly 16 lines (a `console.log()` within the `transform` function was called 16 times) regardless of how I shifted rows around in my data. I had a oneliner like this: `.pipe(transform(row => data.push(row)))` and the implicit `return` from the arrow function was enough to cause the problem. Refactoring to `.pipe(transform(row => { data.push(row); }))` allowed the parser to get all the way through the file. Don't understand why though! – ericsoco Jun 08 '17 at 04:30
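The comment above hinges on a subtlety worth spelling out: a concise arrow body implicitly returns its expression, while a braced body returns `undefined`. A small illustration (the `data`/`row` names here are just placeholders, not part of the original code):

```javascript
var data = [];

// Concise body: implicitly returns push()'s result (the new array length,
// a truthy value), which the transform stream treats as output to push.
var implicit = function (row) { return data.push(row); }; // same as: row => data.push(row)

// Braced body: returns undefined, so nothing is pushed downstream.
var explicit = function (row) { data.push(row); };        // same as: row => { data.push(row); }

console.log(implicit('a')); // 1
console.log(explicit('b')); // undefined
```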