75

Currently I have a single server in Amazon where I put all my cron jobs. I want to eliminate this single point of failure, and expose all my tasks as web services. I'd like to expose the services behind a VPC ELB to a few servers that will run the tasks when called.

Is there some service that Amazon (AWS) offers that can run a reoccurring job (really, call a web service) at scheduled intervals? I'd really like to keep the cron functionality in terms of time/day specification, but farm out the HA of the driver (the thing that calls endpoints at the right time) to AWS.

I like how SQS offers web endpoint(s), but from what I can tell you can't schedule them. SWF doesn't seem to be a good fit either.

rynop
  • 50,086
  • 26
  • 101
  • 112

11 Answers

70

AWS announced support for scheduled functions in Lambda at its 2015 re:Invent conference. With this feature users can execute Lambda functions on a scheduled basis using a cron-like syntax. The Lambda docs show an example of using Python to perform scheduled events.

Currently, the minimum resolution that a scheduled lambda can run at is 1 minute (the same as cron, but not as fine grained as systemd timers).
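As a rough illustration of the idea, here is a minimal sketch of a scheduled Lambda function in Python that does nothing except call a web service. The endpoint URL, the JSON payload, and the `opener` parameter are assumptions made for this example; the function would be attached to a schedule expression such as `rate(1 minute)` or `cron(0/5 * * * ? *)`, and the endpoint is assumed to be reachable from Lambda.

```python
import json
import urllib.request

# Hypothetical endpoint behind the ELB; replace with your own task URL.
TASK_URL = "https://tasks.example.com/run/nightly-report"


def build_request(event, url=TASK_URL):
    """Build the POST request that forwards the scheduled time to the task endpoint."""
    # Scheduled events carry the trigger time in the "time" field.
    body = json.dumps({"scheduled_time": event.get("time")}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def handler(event, context, opener=urllib.request.urlopen):
    """Lambda entry point: call the web service once per scheduled invocation.

    `opener` is an illustrative hook (not part of Lambda) so the HTTP call
    can be swapped out when testing locally.
    """
    request = build_request(event)
    with opener(request, timeout=10) as response:
        return {"status": response.status}
```

Keeping the function this thin means the real work still happens on your own servers; Lambda only plays the role of the cron driver.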

The Lambder project helps to simplify the use of scheduled functions on Lambda.

λ Gordon's cron example has perhaps the simplest interface for deploying scheduled lambda functions.


Original answer, saved for posterity.

As Eric Hammond and others have stated, there is no native AWS service for scheduled tasks. There are only workarounds and half solutions as mentioned in other answers.

To recap the current options:

  • The single-instance autoscale group that starts and stops on a schedule, as described by Eric Hammond.
  • Using a Simple Workflow Service timer, which is not at all intuitive. This case study mentions that JPL used SWF to build a distributed cron, but there are no implementation details. There is also a reference to a code example buried in the SWF code samples.
  • Run it yourself using something like cronlock.
  • Use something like the Unreliable Town Clock (UTC) to run Lambda functions on a schedule. Remember that Lambda cannot currently access resources within a VPC.

Hopefully a better solution will come along soon.

Ben Whaley
  • 32,811
  • 7
  • 87
  • 85
  • Well thank you for providing the updated details. Hopefully Amazon is working on a new service that will solve this problem in the future. – thatidiotguy Jan 20 '15 at 15:23
  • 2
    @thatidiotguy in case you didn't see the announcement, AWS Lambda now addresses this gap. Answer updated accordingly. – Ben Whaley Oct 11 '15 at 02:05
  • 2
    Lambda need not do the work itself. If you used the Lambda function to create an SQS message that triggers the job on one of your servers, you could get around the issue of not being able to run Lambda in a VPC. – ThomasRedstone Oct 27 '15 at 08:30
17

Introducing Events in AWS Cloudwatch

You can schedule by minute, hour, or day, or with a cron expression, all from the console and without Lambda or any programming.

I just scheduled my ASP.NET Web API (HTTP POST) using an SNS HTTP endpoint to execute every minute, and it's working perfectly.
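The same setup can also be scripted instead of clicked together in the console. The sketch below assumes a boto3-style CloudWatch Events client is passed in, that the SNS topic already exists with your HTTP endpoint subscribed to it, and that the topic's policy allows CloudWatch Events to publish. The function name, the target `Id`, and the rule name are made up for illustration.

```python
def schedule_sns_rule(events_client, rule_name, schedule_expression, topic_arn):
    """Create (or update) a scheduled rule and point it at an SNS topic.

    `events_client` is expected to behave like boto3.client("events").
    """
    rule = events_client.put_rule(
        Name=rule_name,
        ScheduleExpression=schedule_expression,  # e.g. "rate(1 minute)"
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        # "sns-http-endpoint" is an arbitrary target id chosen for this sketch.
        Targets=[{"Id": "sns-http-endpoint", "Arn": topic_arn}],
    )
    return rule["RuleArn"]
```

With boto3 installed, a call would look roughly like `schedule_sns_rule(boto3.client("events"), "every-minute", "rate(1 minute)", topic_arn)`.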


Vikash Rathee
  • 1,776
  • 2
  • 25
  • 43
  • This seems great to invoke the Lambda function. – Athar Apr 23 '17 at 12:24
  • I had designed a method called by a Scheduler on a single-instance Beanstalk. When the time came to move to an ELB, I had to find a way to get the method called only once by the Scheduler. I think this will work: a cron job triggering SNS, hitting an endpoint that calls the method that used to be called by the Scheduler! – payne Feb 13 '20 at 01:36
7

Is there some service that Amazon (AWS) offers that can run a reoccurring job at scheduled intervals?

This is one of a few single points of failure that people (including me) keep mentioning when designing architectures with AWS. Until Amazon solves it with a service, here's a hack I've published which is actively used by some companies.

AWS Auto Scaling can run and terminate instances using a recurring schedule specified in the cron format.

http://docs.amazonwebservices.com/AutoScaling/latest/APIReference/API_PutScheduledUpdateGroupAction.html

You can have the instance automatically run a process on startup.

If you don't know how long the job will last, you can set things up so that your job terminates the instance when it has completed.

Here's an article I wrote that walks through exact commands needed to set this up:

Running EC2 Instances on a Recurring Schedule with Auto Scaling
http://alestic.com/2011/11/ec2-schedule-instance

Starting a whole instance just to kick off a set of jobs seems a bit like overkill, but if it's a t1.micro, then it only costs a couple pennies.

That t1.micro doesn't have to do the actual work either. Your instance could inject messages into SQS or through SNS so that the other redundant servers pick up the tasks.
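To give a feel for what the scheduled actions look like in code, here is a minimal sketch, not the exact commands from the article. It assumes a boto3-style Auto Scaling client is passed in and that the Auto Scaling group already exists; the function name and the action names `cron-start` and `cron-stop` are invented for the example.

```python
def schedule_cron_instance(autoscaling_client, group_name, start_cron, stop_cron):
    """Scale the group to one instance at start_cron and back to zero at stop_cron.

    `autoscaling_client` is expected to behave like boto3.client("autoscaling").
    Recurrence values use the five-field cron format, e.g. "0 3 * * *".
    """
    for action_name, recurrence, size in (
        ("cron-start", start_cron, 1),
        ("cron-stop", stop_cron, 0),
    ):
        autoscaling_client.put_scheduled_update_group_action(
            AutoScalingGroupName=group_name,
            ScheduledActionName=action_name,
            Recurrence=recurrence,
            MinSize=size,
            MaxSize=size,
            DesiredCapacity=size,
        )
```

The stop action is a safety net; if the job terminates the instance itself when it finishes, the second schedule only matters when a run hangs.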

Eric Hammond
  • 22,089
  • 5
  • 66
  • 75
  • This will work for one cron. But it will add complexity and cost when you have several such jobs to run on different intervals. – Wasif Jul 24 '12 at 05:35
  • Yeah, I agree with Eric. This is one of the most common problems I run into with AWS. They have a lot of good services, but this is one thing that is really lacking. That's not to say that any of the other cloud providers offer a better alternative. – bwight Jul 24 '12 at 14:36
  • Yep, I've thought of doing something similar, but it's not as robust as I need/want it. Curious, would you AWS power users be willing to pay for a software as a service that solved this problem? I know I could make something that had some additional AWS-specific integration, just wondering if it's worth the investment in time/$. I was thinking the pricing/business model would be along the lines of pingdom.com – rynop Jul 24 '12 at 14:45
  • @mwasif: If you have 20 cron jobs to run each day, you could stack them up in a single scheduled event so that one instance triggers all of them. If you have a lot of different schedules, then it might be cheaper just to have a full-time instance running. You could use Auto Scaling to make sure the instance is replaced if it fails. – Eric Hammond Jul 25 '12 at 00:08
  • @rynop: In what way is this approach not robust for your needs? – Eric Hammond Jul 25 '12 at 00:10
  • @EricHammond I have crons that need to run at minute intervals. Spinning up an instance and getting the code I need deployed will take longer than that. I also want a multi-node cluster that can react to multi-zone (or even single-region) failure. – rynop Jul 25 '12 at 21:57
  • @EricHammond Also, "Each partial instance-hour consumed will be billed as a full hour." so your approach is actually more expensive than just leaving an instance running. You could have more HA for less cost in a cluster (need more logic obviously, this starts down the road of my software as a service idea for this). – rynop Jul 25 '12 at 22:04
  • @rynop: That's why I suggest you would trigger multiple jobs from a single instance run. I explain earlier in this thread that this isn't appropriate for all types of cron schedules. – Eric Hammond Jul 26 '12 at 21:21
4

This is a hosted third-party site that can regularly call scheduled scripts on your domain.

This will not work if you need your script to run in the shell, and not as Apache.

Puggan Se
  • 5,738
  • 2
  • 22
  • 48
Travis Austin
  • 439
  • 5
  • 14
3

Sounds like this might be useful to you: http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-using-task-runner.html

Task Runner is a task agent application that polls AWS Data Pipeline for scheduled tasks and executes them on Amazon EC2 instances, Amazon EMR clusters, or other computational resources, reporting status as it does so. Depending on your application, you may choose to:

  • Allow AWS Data Pipeline to install and manage one or more Task Runner applications for you on computational resources that it manages automatically. In this case, you do not need to install or configure Task Runner as described in this section. This is the recommended configuration.

  • Manually install and configure Task Runner on a computational resource such as a long-running EC2 instance or a physical server. To do so, use the procedures in this section.

  • Develop and install a custom task agent instead of Task Runner. The procedures for doing so will depend on the implementation of the custom task agent.

Joe Zack
  • 3,268
  • 2
  • 31
  • 37
  • Unmarking as correct. After finally getting the time to really look into this problem, Data Pipeline does not seem to solve my specific problem. It's good for setting up non-variable scheduled tasks, like backing up a DynamoDB table daily. A system that requires lots (thousands) of different jobs (called activities) run at different times with different "parameters" would be, IMO, untenable. – rynop Mar 04 '14 at 20:21
3

Amazon introduced Lambda last year for Node.js; yesterday it added Scheduled Functions, VPC Support, and Python Support.

By leveraging Scheduled Functions, a proper replacement for cron can be attained.

More Info - http://aws.amazon.com/lambda/details/


Naveen Vijay
  • 15,928
  • 7
  • 71
  • 92
3

As of August 2020, Amazon has moved the Lambda/CloudWatch events to a service called EventBridge (https://aws.amazon.com/eventbridge/). It was launched in July 2019, after most of the answers to this question.

Hodgson
  • 76
  • 3
1

Looks like this is a relatively new option from AWS Elastic Beanstalk:

https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features-managing-env-tiers.html#worker-periodictasks

Basically, they act like regular SQS receivers, but they're called on a cron schedule instead of in response to an SQS message.

Falcolas
  • 31
  • 2
0

SWF is a web service from AWS that can be used to schedule tasks. Most of the work goes into specifying what the tasks and the schedule are.

http://milindparikh.blogspot.com/2015/07/introducing-diksha-aws-lambda-function.html is a scalable scheduler written against SWF.

0

CloudWatch Events are great, but there is a limit on their number. If you need scale and are willing to sacrifice precision, you could use DynamoDB's TTL as a timer.

The idea is to put items into a DynamoDB table with a TTL set to the time you need to run a task. DynamoDB will delete those items somewhere around the specified time (within 48 hours of expiration). The deleted items will appear in the DynamoDB stream associated with the table. A Lambda function can listen to the stream and take appropriate action on each deletion.
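A minimal sketch of the stream consumer might look like the following. It assumes the stream is configured to include old images, that each item carries a string attribute named `task_id` (a made-up name for this example), and that `run_task` stands in for whatever actually starts the job. TTL deletions are told apart from manual deletes by the service identity on the record.

```python
# Principal that DynamoDB records on stream entries for TTL deletions.
TTL_PRINCIPAL = "dynamodb.amazonaws.com"


def is_ttl_expiry(record):
    """Return True only for REMOVE records produced by the TTL process."""
    identity = record.get("userIdentity") or {}
    return (
        record.get("eventName") == "REMOVE"
        and identity.get("type") == "Service"
        and identity.get("principalId") == TTL_PRINCIPAL
    )


def handler(event, context, run_task=print):
    """Run one task per item that DynamoDB expired via TTL.

    `run_task` is a placeholder hook for this sketch; it defaults to print.
    """
    started = []
    for record in event.get("Records", []):
        if not is_ttl_expiry(record):
            continue  # ignore inserts, updates, and manual deletes
        old_image = record["dynamodb"].get("OldImage", {})
        # "task_id" is a hypothetical attribute name.
        task_id = old_image.get("task_id", {}).get("S")
        run_task(task_id)
        started.append(task_id)
    return started
```

Because expiry can lag well behind the TTL timestamp, this only suits jobs that tolerate running late.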

Read more in "DynamoDB TTL as an ad-hoc scheduling mechanism" by theburningmonk.com.

madhead
  • 31,729
  • 16
  • 153
  • 201
-2

The AWS Elastic Load Balancers will ping your instances to check that they're healthy. You can add your cron-like tasks to the script that the ELB is pinging, and it will execute very regularly.

You'd want to add some logic so that each task is executed the right number of times and at the right interval, but this could be accomplished with a database table that tracks executions. Each time the ELB pings your server, your server would check the database to see if any job is pending, and then execute that job.

The ELB will time out if the script takes too long to execute, so it's important not to create a situation where your ELB health check takes many seconds to process the cron tasks. To overcome this, you can employ the AWS Simple Notification Service. Your ELB health check script can simply publish a message to an SNS topic, and then that topic can deliver the message via an HTTP request to your web server.

In other words: ELB pings your EC2 instance... EC2 instance checks for pending jobs and sends a message to SNS if any are found... SNS notifies your app via HTTP... The HTTP call from SNS is what actually processes the cron job.
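The "database table that tracks executions" part could be sketched like this. SQLite is used only to keep the example self-contained (a real setup would need a database shared by all instances), and the table layout and function names are assumptions. The conditional UPDATE is what keeps two health checks from claiming the same run.

```python
import sqlite3


def create_schema(conn):
    """Create the hypothetical job-tracking table used by this sketch."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "name TEXT PRIMARY KEY, "
        "interval_seconds INTEGER NOT NULL, "
        "next_run REAL NOT NULL)"
    )


def claim_due_jobs(conn, now):
    """Return the names of jobs due at `now` and push each next_run forward."""
    claimed = []
    rows = conn.execute(
        "SELECT name, interval_seconds, next_run FROM jobs "
        "WHERE next_run <= ? ORDER BY name",
        (now,),
    ).fetchall()
    for name, interval, next_run in rows:
        # Only advance the row if nobody else changed it since we read it;
        # a rowcount of 0 means another health check already claimed this run.
        cursor = conn.execute(
            "UPDATE jobs SET next_run = ? WHERE name = ? AND next_run = ?",
            (now + interval, name, next_run),
        )
        if cursor.rowcount == 1:
            claimed.append(name)
    conn.commit()
    return claimed
```

The health check handler would call `claim_due_jobs(conn, time.time())` and publish one SNS message per claimed job, so the check itself returns quickly.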

Travis Austin
  • 439
  • 5
  • 14
  • 6
    It would be simpler to just setup Cron, Chronos or Quartz on the machine. – eSniff Mar 12 '14 at 20:23
  • 1
    Yeah, good point. Whether ELB or cron is firing regularly is irrelevant. You're right. My answer's way too complicated a solution. :-) – Travis Austin Oct 22 '14 at 17:55