2

Is it possible to pull files into S3 automatically (using some sort of cron job or timer) from an FTP server without using EC2 or any other server? Is there any way to achieve that using just S3?

Jimmy
  • Where would you propose the "cron or timer" would run, if not on a server? – Michael - sqlbot Apr 12 '15 at 16:12
  • I'm not sure, but I know AWS has some features that extend the usefulness of S3, such as hosting websites, so I didn't know if they had built in a way to pull in files, as opposed to having files pushed to it – Jimmy Apr 12 '15 at 16:13
  • S3 doesn't have capabilities like this. AWS Lambda may be able to help you, but at the moment even Lambda lacks cron-style scheduling, so you need to trigger it with something. – Adam Ocsvari Apr 12 '15 at 21:25

3 Answers

4

Yes, you can trigger a Lambda function on a schedule by using SNS: set up a time to fire the notification, and wire the message to trigger the Lambda function.

Just to make it clear: this approach eliminates the need to use EC2 completely.

Diaspar
2

S3 does not have a built-in mechanism for fetching files from any external source (HTTP, FTP, etc.).

Apart from the exception that S3 can internally copy content from one bucket to another, the only way to get data into S3 is to upload it from somewhere outside. That could be an EC2 instance, a server in your own data center, or a Raspberry Pi in your basement at home, but typically it would be some kind of actual server, somewhere.

@AdamOcsvari pointed out in comments that an AWS Lambda function could provide the container in which code to fetch a file and store it in S3 could be executed. However, Lambda is an event-driven service that reacts to external events. It does not currently provide a mechanism for time-based events, so something, again most likely some kind of server, would still be needed to invoke the Lambda function on a schedule.
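As a rough illustration of what such a function might look like, here is a minimal Python sketch, assuming the standard-library `ftplib` and the `boto3` S3 client. The helper name `copy_ftp_file_to_s3` and every environment variable name are hypothetical, and the file is buffered entirely in memory, so this only suits files that fit comfortably within the function's memory limit.

```python
import io
import os


def copy_ftp_file_to_s3(ftp, s3_client, remote_path, bucket, key):
    """Download one file from an open FTP connection and store it in S3.

    `ftp` is expected to behave like ftplib.FTP and `s3_client` like
    boto3.client("s3"); both are passed in so the helper stays testable.
    """
    buffer = io.BytesIO()
    # retrbinary delivers the file in chunks; each chunk is appended to the buffer.
    ftp.retrbinary(f"RETR {remote_path}", buffer.write)
    data = buffer.getvalue()
    s3_client.put_object(Bucket=bucket, Key=key, Body=data)
    return len(data)


def lambda_handler(event, context):
    # Imported here so the helper above can be exercised without boto3 installed.
    import ftplib
    import boto3

    # All environment variable names below are hypothetical placeholders.
    with ftplib.FTP(os.environ["FTP_HOST"]) as ftp:
        ftp.login(os.environ["FTP_USER"], os.environ["FTP_PASSWORD"])
        size = copy_ftp_file_to_s3(
            ftp,
            boto3.client("s3"),
            os.environ["FTP_PATH"],
            os.environ["S3_BUCKET"],
            os.environ["S3_KEY"],
        )
    return {"bytes_copied": size}
```

The function's execution role would also need permission to write to the target bucket, and something still has to invoke the handler.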

On the other hand, it's fairly straightforward to build an SFTP/FTP server on an EC2 instance that uses S3 as its backing store (via s3fs and proftpd), so that files sent to your FTP server are automatically stored in S3 and no further copying is required. That, of course, requires a server as well, and may not match what you need to accomplish in any case.

Community
Michael - sqlbot
1

The problem with using EC2 to schedule this pull is that the EC2 instance becomes a single point of failure. Should the instance fail, you might miss a download.

AWS Lambda now supports schedules, so functions can be triggered on a time-based schedule.
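Scheduled invocation is configured through CloudWatch Events (now Amazon EventBridge) rules. Below is a minimal Python sketch of wiring one up with `boto3`, assuming a function that performs the FTP pull already exists. The helper name `schedule_lambda`, the rule name, and the default `rate(1 hour)` expression are illustrative assumptions, not anything prescribed by the answer.

```python
def schedule_lambda(events_client, lambda_client, rule_name, function_arn,
                    expression="rate(1 hour)"):
    """Create a schedule rule and point it at an existing Lambda function.

    `events_client` and `lambda_client` are expected to behave like
    boto3.client("events") and boto3.client("lambda").
    """
    # Create (or update) the rule that fires on the given schedule.
    rule = events_client.put_rule(
        Name=rule_name,
        ScheduleExpression=expression,  # a cron(...) expression also works here
        State="ENABLED",
    )
    # Allow the events service to invoke the function for this rule only.
    # Note: this call fails if the same StatementId was already added.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=f"{rule_name}-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule["RuleArn"],
    )
    # Attach the function as the rule's target.
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "1", "Arn": function_arn}],
    )
    return rule["RuleArn"]


# Example usage (requires boto3 and AWS credentials; names are hypothetical):
# import boto3
# schedule_lambda(boto3.client("events"), boto3.client("lambda"),
#                 "ftp-pull-hourly", "<your function ARN>")
```

The same setup can be done from the console or infrastructure templates; the point is only that no EC2 instance is involved in firing the schedule.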

Bob