
How do you use request to download contents of a file and directly stream it up to s3 using the aws-sdk for node?

The code below gives me Object #<Request> has no method 'read' which makes it seem like request does not return a readable stream...

var req = require('request');
var s3 = new AWS.S3({params: {Bucket: myBucket, Key: s3Key}});
var imageStream = req.get(url)
    .on('response', function (response) {
      if (200 == response.statusCode) {
        //imageStream should be read()able by now right?
        s3.upload({Body: imageStream, ACL: "public-read", CacheControl: 5184000}, function (err, data) {  //2 months
          console.log(err,data);
        });
      }
    });

Per the aws-sdk docs Body needs to be a ReadableStream object.

What am I doing wrong here?

This can be pulled off using the s3-upload-stream module, however I'd prefer to limit my dependencies.

rynop

3 Answers


Since I had the same problem as @JoshSantangelo (zero-byte files on S3) with request@2.60.0 and aws-sdk@2.1.43, let me add an alternative solution using Node's own http module (caveat: this is simplified code from a real-life project and has not been tested separately):

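var http = require('http');
var AWS = require('aws-sdk');

// myBucket is the target bucket name, as in the question
var s3 = new AWS.S3({params: {Bucket: myBucket}});

function copyToS3(url, key, callback) {
    // http.get only handles http: URLs; use require('https') for https: ones
    http.get(url, function onResponse(res) {
        if (res.statusCode >= 300) {
            res.resume(); // discard the body so the socket is released
            return callback(new Error('error ' + res.statusCode + ' retrieving ' + url));
        }
        // res is a readable stream, so it can be passed straight to upload()
        s3.upload({Key: key, Body: res}, callback);
    })
    .on('error', function onError(err) {
        return callback(err);
    });
}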

As far as I can tell, the problem is that request does not fully support the current Node streams API, while aws-sdk depends on it.
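
The `http.get` call above only accepts `http:` URLs; Node's `http` module rejects `https:` ones. Below is a minimal sketch of a protocol-aware variant using only core modules. The `clientFor` helper and the injected `upload` parameter are hypothetical names introduced here for illustration (they are not part of aws-sdk); `upload` is assumed to wrap something like `s3.upload`.

```javascript
var http = require('http');
var https = require('https');

// Pick the core module that matches the URL's protocol.
// Note: new URL() throws synchronously if the string is not a valid URL.
function clientFor(url) {
    return new URL(url).protocol === 'https:' ? https : http;
}

// `upload` is a hypothetical stand-in for the S3 call, e.g.
// function (params, cb) { s3.upload(params, cb); }
function copyToS3(url, key, upload, callback) {
    clientFor(url).get(url, function onResponse(res) {
        if (res.statusCode >= 300) {
            res.resume(); // discard the body so the socket is released
            return callback(new Error('error ' + res.statusCode + ' retrieving ' + url));
        }
        // res is a readable stream and is handed over as the upload body
        upload({Key: key, Body: res}, callback);
    }).on('error', callback);
}
```

Assuming `s3` is an `AWS.S3` instance with the bucket preconfigured, it could be called as `copyToS3(url, key, function (params, cb) { s3.upload(params, cb); }, callback)`. Passing the uploader in also makes the function easy to exercise without touching S3.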


d0gb3r7

You want to use the response object if you're manually listening for the response stream:

var AWS = require('aws-sdk');
var req = require('request');
var s3 = new AWS.S3({params: {Bucket: myBucket, Key: s3Key}});
req.get(url)
    .on('response', function (response) {
      if (200 == response.statusCode) {
        // Pass the response stream, not the request object, as Body.
        // CacheControl must be a string header value, not a number.
        s3.upload({Body: response, ACL: "public-read", CacheControl: "max-age=5184000"}, function (err, data) {  // 2 months
          console.log(err, data);
        });
      }
    });
mscdex
  • Thanks! For others' reference, request's documented way of getting a stream is a bit misleading: https://github.com/request/request/issues/931 – rynop Jun 18 '15 at 22:01
  • I'm having this same problem -- this answer is helpful but the file on s3 ends up being zero bytes. Piping the same request to disk results in a valid file. – Josh Santangelo Jul 28 '15 at 01:07
  • @JoshSantangelo if you're still having that problem, maybe have a look at my alternate solution. – d0gb3r7 Aug 08 '15 at 11:45
  • @JoshSantangelo Probably too late, but it might be that you are using the wrong encoding. Request assumes text data if you do not provide encoding, e.g. image data will be corrupted. Use req.get({ url: url, encoding: null }). Zero bytes seem weird though. – Philiiiiiipp Sep 30 '16 at 10:52
  • Any ideas how to specify the name the file is stored as? – Chet Jun 02 '17 at 23:10
  • @Chet, I think that's the `s3Key` you enter in the `params` object – FireBrand Aug 13 '17 at 10:54

As Request has been deprecated, here's a solution using Axios:

const AWS = require('aws-sdk');
const axios = require('axios');

// IMAGE_BUCKET is the name of the target bucket, defined elsewhere
const downloadAndUpload = async function(url, fileName) {
  // responseType: 'stream' makes res.data a readable stream
  const res = await axios({ url, method: 'GET', responseType: 'stream' });
  const s3 = new AWS.S3(); // Assumes AWS credentials in env vars or AWS config file
  const params = {
    Bucket: IMAGE_BUCKET,
    Key: fileName,
    Body: res.data,
    ContentType: res.headers['content-type'],
  };
  return s3.upload(params).promise();
};

Note that the current version of the AWS SDK (at the time of writing) doesn't throw an exception if the AWS credentials are wrong or missing; the promise simply never resolves.
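
One way to guard against a promise that never settles is to race it against a timer. The following is a minimal sketch; `withTimeout` is a hypothetical helper introduced here, not part of the AWS SDK or Axios. It only stops the caller from waiting forever and does not cancel the underlying upload.

```javascript
// Reject if `promise` has not settled within `ms` milliseconds.
// withTimeout is a hypothetical helper name, not an SDK function.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error('timed out after ' + ms + ' ms')), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards
  // so a pending timeout does not keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

It could be used as `withTimeout(downloadAndUpload(url, fileName), 30000)`, where the 30-second limit is an arbitrary value chosen for illustration.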

skoll
  • I am trying something very similar to this but this is running inside a test file (jest). So code block is enclosed within a `await expect(new Promise((resolve, reject) => {`. I am enclosing your code inside an async function like so `const resp = async (url) => {` and then calling `resp`. The issue is that I end up getting a time-out. Do you have any suggestions? – curious_guy Jul 23 '20 at 00:09
  • I have noticed that the AWS SDK doesn't throw an error if the credentials are missing or don't work; it just never responds, so maybe check that first. – skoll Jul 24 '20 at 04:10
  • +1 This was the issue for me as well. I wish I had known this before banging my head for days and trying different ways to fix this. Thanks @skoll – curious_guy Jul 26 '20 at 17:36