I'm working with EMR (Elastic MapReduce) on AWS. The default way to provide input files (large datasets) to programs is to upload them to an S3 bucket and reference that bucket from within EMR.

Usually I download the datasets to my local development machine and then upload them to S3, but this is getting harder with larger files, since upload speeds are generally much lower than download speeds.

My question: is there a way to download files from the internet (given their URL) directly into S3, so that I don't have to download them to my local machine and then upload them manually?

Felipe
    See here; I think your question is already answered: https://stackoverflow.com/questions/19241671/downloading-a-file-from-internet-into-s3-bucket# – Kevin Glynn Mar 14 '18 at 05:17

1 Answer

No. You need an intermediary. Typically an EC2 instance is used rather than your local machine, for speed.
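As a rough illustration of what the intermediary could run, here is a minimal Python sketch that streams a URL straight into a bucket without writing the file to disk. It assumes boto3 is installed and that the instance has credentials allowing writes to the bucket; the function name `stream_url_to_s3` and its `opener` parameter are made up for this example and are not part of any AWS API.

```python
import urllib.request


def stream_url_to_s3(url, bucket, key, s3_client=None, opener=urllib.request.urlopen):
    """Copy the body of `url` into s3://bucket/key without saving it locally.

    `s3_client` and `opener` are injectable (an assumption of this sketch)
    so the function can be exercised without touching the network.
    """
    if s3_client is None:
        # Imported lazily: boto3 is only needed when no client is supplied.
        import boto3
        s3_client = boto3.client("s3")

    # The HTTP response is a file-like object; upload_fileobj reads from it
    # and sends the data to S3, so the whole file is never held on disk.
    with opener(url) as response:
        s3_client.upload_fileobj(response, bucket, key)

    return f"s3://{bucket}/{key}"
```

Run on an EC2 instance in the same region as the bucket, a call such as `stream_url_to_s3("https://example.com/dataset.csv", "my-bucket", "input/dataset.csv")` (placeholder URL and bucket name) would pull the file over the instance's connection rather than yours. The data still passes through the instance, so this does not contradict the "no" above.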

tedder42