
I've got a Lambda that uses the AWS Java SDK.

In this lambda's handler, I've got code that looks like this:

AmazonSQS sqs = AmazonSQSClientBuilder.defaultClient();
sqs.sendMessage( ... )

I'd expect the above lines to be pretty fast, and for most cases, this is what I'm observing.

However, on cold starts this code takes about 20 seconds to execute. In fact, the first line alone, the client builder, takes about 10 seconds to complete.

Is this the expected performance of the AWS SQS Java APIs on cold starts?

V Maharajh
  • Do you mean the cold start of Lambda? – dmigo Oct 16 '19 at 21:09
  • Yes, cold start of the Lambda that this code is the handler for. The API Gateway timeout is 29 seconds for lambdas, and these two lines alone take 20 seconds, so I'm running into timeouts. – V Maharajh Oct 16 '19 at 21:55
  • Is your Lambda in a VPC? – stdunbar Oct 17 '19 at 00:01
  • You might consider divorcing the function called via API Gateway from the function actually sending the message to avoid these timeouts. That said, could you have this lambda invoke your SQS sender lambda function asynchronously? – andrewec Oct 17 '19 at 03:46
  • @stdunbar yes, in a VPC – V Maharajh Oct 17 '19 at 16:12
  • @andrewec, I thought the purpose of SQS was to insulate developers from that need: to be able to quickly enqueue a job for later and return. Therefore I thought I was just using it incorrectly. I tried the Async version of AmazonSQSClientBuilder. The creation of the builder still takes 10 seconds. The `sendMessageAsync` is instant, but it doesn't seem to actually do anything. I don't know if the lambda returning before the sendMessage completes ends up cancelling it. – V Maharajh Oct 17 '19 at 16:14
  • Sure, but what I recommended isn't really queuing a job; it's just starting one instead. Separating the SQS call from the function called by API Gateway frees you from the 30 second timeout without actually making the function run faster. Read here: https://stackoverflow.com/questions/39126606/invoke-aws-lambda-from-another-lambda-asynchronously – andrewec Oct 18 '19 at 18:15
  • Something that might be a problem: you won't be able to return a response based on the SQS behavior to the API. So if anything goes wrong, you won't have a way to send that back to the user. Here's some other good reading in the Documentation: https://docs.aws.amazon.com/en_pv/lambda/latest/dg/invocation-async.html – andrewec Oct 18 '19 at 18:17
  • Just to update folks: I switched to JavaScript lambdas a while ago and this is not an issue at all. Turns out the Java SDK is a bit under-loved. – V Maharajh Jun 23 '21 at 16:55

1 Answer


You can create a "keep warm" trigger in CloudWatch that calls your Lambda every 5-15 minutes to keep it warm. You get a million free Lambda calls every month, so it shouldn't really affect you too much. This is how libraries like Zappa keep your APIs warm, so it is a well-established practice.

You can read more here.
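For illustration, here is a minimal sketch of how a handler might treat such keep-warm pings. It assumes the trigger is a scheduled CloudWatch Events rule, whose events carry `"source": "aws.events"` and `"detail-type": "Scheduled Event"`. The `KeepWarmHandler` class, the `MessageSender` interface, and the `body` key are hypothetical stand-ins (not part of the AWS SDK) so the example runs without any dependencies; in a real Lambda, the supplier would wrap something like `AmazonSQSClientBuilder.defaultClient()`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

class KeepWarmHandler {
    // Hypothetical stand-in for the expensive-to-build SQS client.
    interface MessageSender {
        void send(String body);
    }

    private final Supplier<MessageSender> senderFactory;
    // Cached so the client is built once per container, not once per call.
    private MessageSender sender;

    KeepWarmHandler(Supplier<MessageSender> senderFactory) {
        this.senderFactory = senderFactory;
    }

    // Scheduled CloudWatch Events rules deliver events with these two fields.
    static boolean isWarmUpPing(Map<String, Object> event) {
        return event != null
                && "aws.events".equals(event.get("source"))
                && "Scheduled Event".equals(event.get("detail-type"));
    }

    String handle(Map<String, Object> event) {
        // Build the client on the first call, including warm-up pings,
        // so that a later real request does not pay the construction cost.
        if (sender == null) {
            sender = senderFactory.get();
        }
        if (isWarmUpPing(event)) {
            return "warmed"; // nothing to enqueue for a ping
        }
        // "body" is an assumed key for the payload of a real request.
        sender.send(String.valueOf(event.get("body")));
        return "sent";
    }

    public static void main(String[] args) {
        List<String> queue = new ArrayList<>(); // in-memory fake queue
        KeepWarmHandler handler = new KeepWarmHandler(() -> queue::add);

        Map<String, Object> ping =
                Map.of("source", "aws.events", "detail-type", "Scheduled Event");
        Map<String, Object> request = Map.of("body", "hello");

        System.out.println(handler.handle(ping));
        System.out.println(handler.handle(request));
        System.out.println(queue);
    }
}
```

The point of the design is that the ping itself triggers the slow client construction, so the cost is paid by the scheduled call rather than by a user-facing request. Note that this only keeps one container warm; concurrent requests can still land on a cold one.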

Ninad Gaikwad
  • Thanks, this is a decent workaround. Ideally I'd like to see if I can speed up the cold start, since putting an item on a queue feels like it shouldn't take 20 seconds. – V Maharajh Oct 17 '19 at 16:17
  • Even with a cold start, 20 seconds seems unreasonably long. Can you try printing something to the console as soon as the lambda is invoked to see how long it takes in a warm vs. cold start? The difference is usually something like 2-3 seconds at most. Maybe your SQS settings are delaying the message from being sent immediately? – Ninad Gaikwad Oct 17 '19 at 16:21
  • Yeah I thought the 20 sec is unreasonable too. The lambda only takes 3 seconds to start executing (unload jar etc). I double checked the SQS settings and it is all defaults, no long polling or delays. – V Maharajh Oct 17 '19 at 17:10
  • What programming language are you using? Have you tried running the code locally? – Ninad Gaikwad Oct 18 '19 at 02:49
  • Java. No, I haven't set up a local deployment for my stack. – V Maharajh Oct 18 '19 at 04:08
  • I have only ever implemented SQS using Python. It adds around 100 items to my queue in just a few seconds. It might just be a language thing. – Ninad Gaikwad Oct 18 '19 at 04:13
  • It is indeed. No issue when I use JavaScript lambdas either. – V Maharajh Jun 23 '21 at 16:55
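
The timing check suggested in the comments above could be sketched roughly as follows. This is only an illustration: `ColdStartTimer` and its methods are hypothetical names, and the `Supplier` passed to `timed` stands in for whichever step is being measured (for example, the client builder or the `sendMessage` call). It relies on the fact that static fields survive between invocations in a warm container, so a static flag can tell the first (cold) invocation from later ones.

```java
import java.util.function.Supplier;

class ColdStartTimer {
    // True until the first invocation in this container finishes.
    private static boolean coldStart = true;

    // Runs one step, logs how long it took, and returns the step's result.
    static <T> T timed(String label, Supplier<T> step) {
        long start = System.nanoTime();
        T result = step.get();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println((coldStart ? "cold" : "warm")
                + " " + label + " took " + elapsedMs + " ms");
        return result;
    }

    // Call at the end of the handler so later invocations log as warm.
    static void markInvocationDone() {
        coldStart = false;
    }

    static boolean isColdStart() {
        return coldStart;
    }

    public static void main(String[] args) {
        // Two simulated invocations; the lambdas stand in for real work.
        for (int i = 0; i < 2; i++) {
            String client = timed("client build", () -> "fake-client");
            timed("send", () -> client.length());
            markInvocationDone();
        }
    }
}
```

Wrapping the builder and the send call separately shows which of the two accounts for the delay, and the cold/warm prefix in the log lines makes the two cases easy to compare in CloudWatch Logs.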