I am trying to read a file from a third-party AWS S3 bucket which is in .gz format. I need to process the data in the file and then upload the file to our own S3 bucket.
For reading the file, I am creating a read stream from S3.getObject as shown below:
const fileStream = externalS3.getObject({Bucket: <bucket-name>, Key: <key>}).createReadStream();
To make the code more efficient, I am planning to use the same fileStream
for both processing the contents and uploading to our own S3 bucket. The code below does this, but it does not upload the file to the internal S3 bucket:
import stream from "stream";
import { createGunzip } from "zlib";
import JSONStream from "JSONStream";
const uploadStream = fileStream.pipe(new stream.PassThrough());
const readStream = fileStream.pipe(new stream.PassThrough());
await internalS3.upload({Bucket:<bucket-name>, Key: <key>, Body: uploadStream})
.on("httpUploadProgress", progress => {console.log(progress)})
.on("error", error => {console.log(error)})
.promise();
readStream.pipe(createGunzip())
.on("error", err =>{console.log(err)})
.pipe(JSONStream.parse())
.on("data", data => {console.log(data)});
However, the code below successfully uploads the file to the internal S3 bucket:
const uploadStream = fileStream.pipe(new stream.PassThrough());
await internalS3.upload({Bucket:<bucket-name>, Key: <key>, Body: uploadStream})
.on("httpUploadProgress", progress => {console.log(progress)})
.on("error", error => {console.log(error)})
.promise();
What am I doing wrong here?
NOTE: If I use separate fileStreams to upload and read the data (one getObject stream per consumer), it works fine, as in the sketch below. However, I need to achieve this using the same fileStream.
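For reference, this is a minimal sketch of the separate-fileStreams variant that works, assuming AWS SDK v2 (the same externalS3/internalS3 clients and bucket/key placeholders as above, imports as in the first snippet; the names uploadSource and readSource are just for illustration):

// Each consumer gets its own independent read stream from S3.
const uploadSource = externalS3.getObject({Bucket: <bucket-name>, Key: <key>}).createReadStream();
const readSource = externalS3.getObject({Bucket: <bucket-name>, Key: <key>}).createReadStream();

// Consumer 1: upload the raw .gz bytes to the internal bucket.
await internalS3.upload({Bucket: <bucket-name>, Key: <key>, Body: uploadSource}).promise();

// Consumer 2: gunzip and parse the JSON contents.
readSource.pipe(createGunzip())
    .on("error", err => console.log(err))
    .pipe(JSONStream.parse())
    .on("data", data => console.log(data));

The obvious downside is that the object is downloaded from the third-party bucket twice, which is the cost I am trying to avoid by sharing a single fileStream.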