
I am trying to fetch the contents of a CSV file from AWS S3 using axios, parse it with csv-parser, and then store the parsed data in my local database. Since the bucket/file is public, I'm pretty sure I don't need to include the bucket name, key, or access/secret keys when performing the request. Right now I am able to fetch the contents of the CSV file, but it is not parsed: it comes back as plain text, with the header row first and then one line per record, something like this:

taskID,status,lastUpdate
1,ongoing,2023-01-01
2,completed,2023-01-02

I tried using csv-parser but I'm not sure what I'm missing and I can't seem to display the error. Here's what I've done so far:

const fs = require("fs")
const csvParser = require("csv-parser")
const axios = require("axios")

async function parseCSVFile(filePath) {
  let parsedData = []

  axios.get(filePath).then(function(response) {
      let csvData = response.data
      console.log(csvData);

      csvParser(csvData, { headers: true })
      .on('data', function(data) {
        console.log('asdasd');
        parsedData.push(data)
      })
      .on('end', function() {
        console.log('CSV data parsed', parsedData);
      })
      .on('error', function() {
        console.log("Error parsing CSV data");
      })
  })
  
}

I tried using fs initially and of course it worked, but only because the file I was parsing is stored locally. With axios, my csvData has a value, but it's plain text and nothing gets pushed to parsedData.
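
For reference, this is roughly the local version that worked for me (the file name here is just a placeholder):

const fs = require("fs")
const csvParser = require("csv-parser")

const parsedData = []

// reading the file from disk gives a stream, which is what csv-parser expects
fs.createReadStream("./tasks.csv")
  .pipe(csvParser())
  .on("data", function (row) {
    parsedData.push(row)
  })
  .on("end", function () {
    console.log("CSV data parsed", parsedData)
  })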

indigo

1 Answer

csv-parser works with streams that are piped into it; it doesn't take a string as an argument the way your code passes one. Its only argument is an options object, which is why your handlers never fire.
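
(If you really wanted to keep the plain-string response, you could wrap it in a readable stream yourself with Node's built-in stream module, roughly like the sketch below, but the simpler route is to ask axios for a stream in the first place, as shown next.)

const { Readable } = require("stream")
const csvParser = require("csv-parser")

// csvString is the plain text you already logged from response.data
Readable.from([csvString])   // wrapped in an array so the whole string is one chunk
  .pipe(csvParser())
  .on("data", function (row) {
    console.log(row)
  })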

So, turn axios response to a stream:

axios.get(filePath, { responseType: 'stream'})

and then pipe response stream to csv-parser:

response.data.pipe(csvParser({ headers: true }))

Try this:

axios.get(filePath, { responseType: 'stream'}).then(function(response) {
    let csvData = response.data
    console.log(csvData); // this is a stream now..

    csvData.pipe(csvParser({ headers: true }))
    .on('data', function(data) {
      console.log('row', data);
      parsedData.push(data)
    })
    .on('end', function() {
      console.log('CSV data parsed', parsedData);
    })
    .on('error', function(err) {
      console.log("Error parsing CSV data", err);
    })
})
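
If you also want parseCSVFile to actually resolve with the rows (instead of just logging them), one way, sticking to the same filePath URL, is to wrap the stream events in a Promise, roughly:

const axios = require("axios")
const csvParser = require("csv-parser")

async function parseCSVFile(filePath) {
  // ask axios for a stream so it can be piped straight into csv-parser
  const response = await axios.get(filePath, { responseType: "stream" })

  return new Promise(function (resolve, reject) {
    const parsedData = []
    response.data
      .pipe(csvParser())
      .on("data", function (row) {
        parsedData.push(row)
      })
      .on("end", function () {
        resolve(parsedData)
      })
      .on("error", reject)
  })
}
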
traynor
  • I ended up adding a catch after the get method to return an error if the file/URL path does not exist or is invalid. And for some reason, adding ```headers: true``` when calling ```csvParser``` returned an object containing (1) an array of all the column names and then (2) arrays of the values (one array per record), so I removed it. – indigo Aug 07 '23 at 02:59
  • Great. Well, you can set options depending on what you're trying to do; check [options](https://www.npmjs.com/package/csv-parser#headers) – traynor Aug 07 '23 at 06:08