
I am using the @aws-sdk/client-s3 package to read a JSON file from S3, take the contents, and dump it into DynamoDB. This all currently works fine using:

const data = await new S3Client({ region }).send(new GetObjectCommand(bucketParams));

And then deserialising the response body etc.
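For the current JSON file, that deserialisation is just something like this (assuming a recent v3 SDK, where the Body exposes transformToString()):

const contents = JSON.parse(await data.Body.transformToString()); // reads the whole body into memory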

However, I'm looking to migrate to the jsonlines format, effectively CSV, in the sense that it needs to be streamed in line by line, or in chunks of lines, and processed. I can't seem to find a way of doing this that doesn't load the entire file into memory (using response.text() etc.).

Ideally, I would like to pipe the response into a createReadStream, and go from there.
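For illustration, this is roughly the shape I'm hoping to end up with (an untested sketch; the bucket, key, region, and the DynamoDB step are placeholders, and it assumes the v3 Body is a Node readable stream when running in Node):

import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3';
import readline from 'readline';

const client = new S3Client({ region: 'eu-west-1' }); // placeholder region
const bucketParams = { Bucket: 'myBucket', Key: 'data.jsonl' }; // placeholders

const { Body } = await client.send(new GetObjectCommand(bucketParams));

// In Node.js the v3 Body is a stream.Readable, so it can feed readline directly
const rl = readline.createInterface({ input: Body, crlfDelay: Infinity });
for await (const line of rl) {
  const item = JSON.parse(line); // one JSON object per jsonlines line
  // ...put item into DynamoDB here
}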

Sheen
  • Already covered here: https://stackoverflow.com/questions/36942442/how-to-get-response-from-s3-getobject-in-node-js – Sheen Feb 03 '23 at 16:25

1 Answer


I found this example with createReadStream() from the fs module in Node.js:

import fs from 'fs';

function read() {
  let data = '';
  const readStream = fs.createReadStream('business_data.csv', 'utf-8');
  readStream.on('error', (error) => console.log(error.message));
  readStream.on('data', (chunk) => data += chunk); // accumulate each chunk
  readStream.on('end', () => console.log('Reading complete'));
}

read();

You can modify it for your use. Hope this helps.
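For jsonlines specifically, one possible modification (a sketch, assuming one JSON object per line; the file name is a placeholder) is to let readline split the stream so only one line at a time is buffered, instead of accumulating the whole file:

import fs from 'fs';
import readline from 'readline';

function readLines() {
  const readStream = fs.createReadStream('business_data.jsonl', 'utf-8');
  const rl = readline.createInterface({ input: readStream, crlfDelay: Infinity });
  rl.on('line', (line) => {
    const record = JSON.parse(line); // one JSON object per line
    // process the record here, e.g. queue it for a DynamoDB write
  });
  rl.on('close', () => console.log('Reading complete'));
}

readLines();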

You can make the connection to S3 like this:

var AWS = require('aws-sdk'); // v2 SDK
var s3 = new AWS.S3({apiVersion: '2006-03-01'});
var params = {Bucket: 'myBucket', Key: 'myImageFile.jpg'};
var file = require('fs').createWriteStream('/path/to/file.jpg');
// Stream the object from S3 straight into a local file
s3.getObject(params).createReadStream().pipe(file);

see here
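Since getObject(params).createReadStream() returns an ordinary Node readable stream, the same readline approach should work directly against S3; a sketch (bucket and key are placeholders, assuming the v2 aws-sdk):

var AWS = require('aws-sdk');
var readline = require('readline');

var s3 = new AWS.S3({apiVersion: '2006-03-01'});
var params = {Bucket: 'myBucket', Key: 'business_data.jsonl'};

// Feed the S3 object stream into readline instead of writing it to a file
var rl = readline.createInterface({ input: s3.getObject(params).createReadStream() });
rl.on('line', function (line) {
  var record = JSON.parse(line); // one jsonlines record at a time
  // process the record here
});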

Micha
  • This is what I'll be using to read the file; however, what I'm unsure about is how to make the stream connection to S3 for that. – Sheen Feb 03 '23 at 15:50