
All,

I have a rather disturbing problem with my Amazon Elastic Beanstalk worker combined with SQS, which is supposed to provide cron job scheduling - all of this running on PHP.

Here is the scenario: I need a PHP script to be executed regularly in the background, and it might eventually run for hours. I saw this nice introduction, which seems to cover exactly my scenario (AWS Worker Environments - see the Periodic Task part).

So I read quite a lot of howtos, set up an Elastic Beanstalk worker with SQS (the queue is actually created automatically when the worker environment is created), and provided the cron config (cron.yaml) within my deployment package.
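For reference, a periodic task in a worker environment is declared in a cron.yaml at the root of the deployment package. A minimal sketch (the task name, URL path, and the 5-minute schedule here are illustrative assumptions):

```yaml
version: 1
cron:
  - name: "my-background-task"   # arbitrary task label (assumption)
    url: "/worker/cron.php"      # POST target on the worker app (assumption)
    schedule: "*/5 * * * *"      # standard cron syntax: every 5 minutes
```

On each scheduled tick, a message is put into the queue and the sqsd daemon POSTs it to the given URL.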

The cron script is properly recognized. The SQS daemon starts, messages are put into the queue and trigger my PHP script exactly on schedule. The script runs and everything works fine.

The configuration of the queue looks like this: SQS configuration

However, after some time of processing (the script is still busy - and NO, it is not the next scheduled run^^), a second message is opened and another instance of the same script is executed, and another, and another... at exactly 5-minute intervals.

I suspect that the message is somehow not removed from the queue (although I ensured that the script sends status 200 back), which results in a new message being delivered if the script runs for too long.

Is there a way to prevent these additional messages from spawning? Can I tell the queue or the SQS daemon not to put new messages in flight? Do I have to remove the message in my code, even though the tutorial states this should happen automatically?

I would like to simply trigger the script, remove the message from the queue, and let the script run. No fancy fallback/retry mechanisms, please :-)

I spent many hours trying to find something on the internet, without success. Any help is appreciated.

Thanks

Jarek
  • This has been asked a while ago but I had the same problem on Amazon Linux AMI 2 and this article helped me https://dev.to/rizasaputra/understanding-aws-elastic-beanstalk-worker-timeout-42hi – Romain Nov 09 '21 at 10:01

4 Answers


a second message is opened and another instance of the same script is executed, and another, and another... in exactly 5 minutes intervals.

I doubt it is a second message. I believe it is the same message.

If you don't respond 200 OK before the Inactivity Timeout expires, the message goes back to the queue, and yes, you'll receive it again, because the system assumes you've crashed and that you would want to see it again. That's part of the design.

There's an X-Aws-Sqsd-Receive-Count request header you're receiving that tells you approximately how many times the current message has been delivered. The X-Aws-Sqsd-Msgid request header identifies the unique message.
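In PHP, those request headers surface through $_SERVER. A minimal sketch of inspecting them before doing any work (the logging and variable names are illustrative assumptions, not part of the sqsd contract):

```php
<?php
// sqsd request headers appear in $_SERVER with the HTTP_ prefix.
$receiveCount = (int) ($_SERVER['HTTP_X_AWS_SQSD_RECEIVE_COUNT'] ?? 1);
$messageId    = $_SERVER['HTTP_X_AWS_SQSD_MSGID'] ?? 'unknown';

if ($receiveCount > 1) {
    // The same message was delivered before - a redelivery after the
    // visibility timeout expired, not a new scheduled run.
    error_log("Redelivery #{$receiveCount} of message {$messageId}");
}

// Acknowledge quickly; work that continues past the timeout after this
// point still risks another redelivery.
http_response_code(200);
```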

If you can't ensure that the script will finish before the timeout, then this is not likely an appropriate use case for this service. It sounds like the service is working correctly.

Michael - sqlbot
    Hey Michael, thanks for the response. You're right. It is the same message. I send 200 as described here [link](http://stackoverflow.com/questions/15273570/continue-processing-php-after-sending-http-response) but it doesn't fool the daemon. I will experiment with it a little more. In case that I fail to send 200 fast enough - is there a way to extend the Inactivity Timeout or prevent the queue from spawning the message again? – Jarek Feb 15 '17 at 21:18
  • any solution yet? – leoschet Aug 30 '18 at 22:44

I know this doesn't directly answer your question about configuration, but I ran into a similar issue: my queue configuration is set exactly like yours, and in my Elastic Beanstalk setup I've set the Visibility Timeout to 1800 seconds (half an hour) and Max Retries to 2.

If a job runs for more than a minute, it gets run again and then thrown into the dead-letter queue, even though a 200 OK is returned from the application every time.

After a few hours, I realized that it was the Nginx server in front of the application that was timing out - checking the Nginx error log yielded that insight. I don't know why Elastic Beanstalk includes a web server in this scenario... If all else fails, you may want to check whether EB spawns a web server in front of your application.
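If the proxy does turn out to be the culprit, its timeout can be raised through an .ebextensions config file. A sketch for the classic Amazon Linux platform (the file path and the 1800-second value are assumptions; newer AL2/AL2023 platforms expect the override under .platform/nginx/conf.d/ instead):

```yaml
# .ebextensions/nginx-timeout.config (hypothetical file name)
files:
  "/etc/nginx/conf.d/proxy_timeout.conf":
    mode: "000644"
    owner: root
    group: root
    content: |
      # Allow slow worker requests to finish before Nginx gives up.
      proxy_read_timeout 1800s;
```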

treble_maker

Look at the Worker Environments documentation for details on the values you can configure. You can configure several different timeout values as well as "Max retries", which, if set to 1, will prevent re-sends. However, your dead-letter queue will then fill up with messages that were actually processed successfully, so that might not be your best option.
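Those worker settings can also be pinned in the deployment package via the aws:elasticbeanstalk:sqsd option namespace. A minimal sketch (the 1800-second values mirror the timeouts discussed in this thread and are assumptions, not recommendations):

```yaml
# .ebextensions/worker.config (hypothetical file name)
option_settings:
  aws:elasticbeanstalk:sqsd:
    InactivityTimeout: 1800   # seconds sqsd waits for a 200 response
    VisibilityTimeout: 1800   # seconds the message stays invisible in SQS
    MaxRetries: 1             # 1 = no re-send before the dead-letter queue
```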

Brian

This issue is caused by the SQS visibility timeout.

Michael's answer is correct: the request must return a 200 response within the SQS visibility timeout.

The SQS visibility timeout can be increased in the SQS queue configuration. You can refer to the AWS documentation for the other parameters.