Using Node.js, I am trying to upload a large file (700 MB ~ 1 GB), which I receive as the response to a POST request (made with request), to my S3 bucket.

Using the aws-sdk for JavaScript, I've tried two approaches, but each had a different issue:

Approach 1 - Trying to invoke the s3.upload() function on the response event:

const AWS = require('aws-sdk');
const request = require('request');

const sendRequest = (data) => {
    try {
        return new Promise((resolve, reject) => {
            AWS.config.loadFromPath('./config/awsConfig.json');
            let s3 = new AWS.S3({ params: { Bucket: 'myBucket', Key: 'path/to/file.csv' } });

            request({
                method: 'POST',
                uri: 'https://www.example.com/apiEndpoint',
                headers: {
                    host: 'example.com',
                    'content-type': 'application/json'
                },
                body: JSON.stringify(data)
            }).on('response', (response) => { // 1st approach
                if (response.statusCode === 200) {
                    s3.upload({
                        Body: response,
                        ACL: 'public-read',
                        CacheControl: '5184000' // 2 months
                    }, (err, data) => {
                        console.log(err, data);
                    });
                }
            }).on('error', (error) => {
                reject(error);
            }).on('end', () => {
                resolve();
            });
        });
    } catch (error) {
        throw new Error('Unable to get and upload file');
    }
};

Result: s3.upload() is called once. The file is created in the bucket but has no data in it (zero bytes).
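For the zero-byte symptom, the fix I have seen suggested most often is to pipe the response into a stream.PassThrough and hand that stream to s3.upload(), so the SDK consumes a plain Readable that it controls. A minimal sketch of that pattern, reusing the placeholder endpoint, bucket, and key from above (not verified against my exact setup):

const AWS = require('aws-sdk');
const request = require('request');
const { PassThrough } = require('stream');

const uploadViaPassThrough = (data) => {
    AWS.config.loadFromPath('./config/awsConfig.json');
    const s3 = new AWS.S3();
    const pass = new PassThrough();

    // Start the managed upload first; it reads from `pass` as bytes arrive.
    const upload = s3.upload({
        Bucket: 'myBucket',
        Key: 'path/to/file.csv',
        Body: pass,
        ACL: 'public-read',
        CacheControl: '5184000' // 2 months
    }).promise();

    // Pipe the POST response body into the PassThrough.
    request({
        method: 'POST',
        uri: 'https://www.example.com/apiEndpoint',
        headers: { 'content-type': 'application/json' },
        body: JSON.stringify(data)
    }).pipe(pass);

    return upload; // resolves once S3 has stored the whole object
};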

Approach 2 - Trying to invoke the s3.upload() function on the data event:

const sendRequest = (data) => {
    try {
        return new Promise((resolve, reject) => {
            AWS.config.loadFromPath('./config/awsConfig.json');
            let s3 = new AWS.S3({ params: { Bucket: 'myBucket', Key: 'path/to/file.csv' } });

            request({
                method: 'POST',
                uri: 'https://www.example.com/apiEndpoint',
                headers: {
                    host: 'example.com',
                    'content-type': 'application/json'
                },
                body: JSON.stringify(data)
            }).on('data', (chunk) => { // 2nd approach: one upload per chunk
                s3.upload({
                    Body: chunk,
                    ACL: 'public-read',
                    CacheControl: '5184000' // 2 months
                }, (err, data) => {
                    console.log(err, data);
                });
            }).on('error', (error) => {
                reject(error);
            }).on('end', () => {
                resolve();
            });
        });
    } catch (error) {
        throw new Error('Unable to get and upload file');
    }
};

Result: s3.upload() is called on every data event. The file is created in the bucket, but each emitted chunk overwrites the previous one, so in the end only the last chunk (7 KB ~ 10 KB) remains. Also, s3.upload() keeps being called even after resolve() has already run.
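As far as I understand, each s3.upload() call PUTs a complete object at the same Key, which would explain the overwriting: every chunk becomes the whole file. If a single streaming upload over one stream is the way to go, the managed uploader also accepts part-size and concurrency options as a second argument; a sketch with illustrative values (not tuned for my case):

const AWS = require('aws-sdk');
const { PassThrough } = require('stream');

AWS.config.loadFromPath('./config/awsConfig.json');
const s3 = new AWS.S3();
const pass = new PassThrough(); // fed by the HTTP response, as in the sketch above

s3.upload(
    {
        Bucket: 'myBucket',
        Key: 'path/to/file.csv',
        Body: pass
    },
    { partSize: 10 * 1024 * 1024, queueSize: 4 }, // 10 MB parts, 4 uploaded in parallel
    (err, result) => console.log(err, result)
);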

Notes:

1) The function returns a Promise because the rest of my process needs to run sequentially and wait for this step to finish (see the sketch after these notes for tying the Promise to the upload itself rather than to the HTTP response).

2) Both approaches are taken from the answers to "Stream response from nodejs request to s3" and "Piping from request.js to s3.upload results in a zero byte file".

3) A 3rd approach would be to stream to a local file on my server and only then upload to S3. I would very much like to avoid that.
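Regarding note 1: if the Promise wrapper stays, I suspect resolve/reject should be wired to the s3.upload() callback rather than the request's 'end' event, so the Promise settles only once S3 actually has the data. A sketch of just that wiring, with the same placeholder bucket/key (bodyStream stands for whatever Readable carries the response bytes):

const AWS = require('aws-sdk');

// Settle the Promise on upload completion, not on HTTP 'end'.
const uploadStream = (s3, bodyStream) =>
    new Promise((resolve, reject) => {
        s3.upload({
            Bucket: 'myBucket',
            Key: 'path/to/file.csv',
            Body: bodyStream,
            ACL: 'public-read',
            CacheControl: '5184000' // 2 months
        }, (err, data) => {
            if (err) reject(err);   // upload failed
            else resolve(data);     // data.Location holds the object URL
        });
    });

Equivalently, aws-sdk v2 exposes a .promise() method on the value returned by s3.upload(), which removes the need for the manual wrapper.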

Any ideas on how to get it to work?
