Provisioned concurrency has minor impact on response time of Lambda function

Question

We are using a serverless architecture along with AWS Lambda and an API gateway. The execution time of the lambda is in the range of a few milliseconds. The final response at the client's end is received in seconds (which is far more than the lambdas execution time even if init duration is also taken into count considering cold-start scenarios).

While debugging this using API gateway logs, there is integration latency in seconds which is making the end-to-end response considerably slow. To remove the init duration or cold-start I have added rules in CloudWatch for periodically calling the lambdas to keep them warm.

The init duration is removed completely and this helped in reducing integration latency as well. There are Lambdas which can not be scheduled as authentication is required for calling them and for this I have added provisioned concurrency of 5.

This Lambda has init duration as well in the logs. Lambda provisioning is another option to get rid of cold-starts but this is not having impact over the time in which Lambda's response is available at API gateway.

I have followed below links to assign provisioned concurrency to Lambdas:

CloudWatch logs of the Lambda to which I have added provisioning to:

Duration: 1331.38 ms Billed Duration: 1332 ms Memory Size: 256 MB Max Memory Used: 130 MB Init Duration: 1174.18 ms

One thing I could notice in API Gateway and Lambda logs is that Request to lambda was sent from API Gateway at 2021-02-15T11:51:36.621+05:30 but, it was received at Lambda at 2021-02-15T11:51:38.535+05:30. There is about 2 seconds of delay in getting the request at Lambda.

AWS X-RAY TRACING

I have enabled AWS X-Ray logging for API gateway and Lambda both and this is what I have by the traces. This request took 595 ms in total but at Postman the response was received in 1558 ms. From where the delay of approximately 1 second is being added to receiving a response from API gateway?

@KaiBurjack It is the first invocation that is extremely slow. Invocations followed are comparatively fast but not to the mark. I am getting integration latency of more than 1 sec at API gateway. — Avani Khabiya, Feb 12 '21 at 16:15
Please give us numbers, data, code and/or logs - at the very least the last lines of the Lambda logs where the durations for inits etc. are logged as well. How large is the lambda function, which language is it written in? How much memory have you provisioned? Please show us some of the logs from the API gateway. Right now this is pretty much guesswork ;-) — Maurice, Feb 12 '21 at 17:11
@Maurice Updated the question with AWS X-Ray tracings. Please have a look. — Avani Khabiya, Feb 18 '21 at 13:16
Are you using any proxy or other weird network device or service (Tor?) that might intercept and inspect the request? So it looks like the time API Gateway took to process the request is about 600ms, which suggests the other roughly 1000ms get added somewhere on the network path. Are you talking to an API on the other end of the world? If you still think it's the API Gateway, you can enable detailed logging on the Stage and show us some of those logs. — Maurice, Feb 18 '21 at 15:14
@Maurice The API gateway and Lambda are located in the same AWS region and are in the same VPC and I am calling this Lamda using Postman. I suspect, this 1000 ms are because of VPC. About third party service, no such service is being used. — Avani Khabiya, Feb 19 '21 at 06:43
@Maurice Can you help me with detailed logging? I mean, I have enabled AWS X-Ray for API Gateway and Lambda to monitor the time taken at every step, also logged time taken by Mongodb query which is executed inside Lambda using `console.time()` and `console.timeEnd()`. But how do I log the time consumed between API Gateway and the end response received at Postman? — Avani Khabiya, Feb 19 '21 at 06:56
I have a similar issue, even with 10 provisioned and only hit the site once, it takes 5 seconds to init. I'm not sure I understand provisioned concurrency if 1 hit to a lambda with 10 provisioned still takes a 5 second init hit. (Java based lambda). — scottseeker, Mar 16 '23 at 14:00

score 1 · Answer 1 · answered Feb 12 '21 at 16:46

I believe the reason is that the provisioned concurrency of 5 is not enough and you still run into cold starts of your Lambda function. This means if the external service is calling your API endpoint (i.e. your Lambda function behind API Gateway), your Lambda function is warm with 5 instances. If we assume your Lambda function can handle 2 requests per second (500ms for each invocation), then you can roughly handle 10 requests per second with your Lambda function. If the external service is making 20 requests per second, AWS Lambda tries to spin up new instances because the existing ones are busy handling requests. This has the consequence that the external service experiences high response times because of cold starts of your function.

Also, consider that the instances of your Lambda function do not live "forever" but are cleaned up after some point. I.e. if you experience many spikes in your traffic patterns, then this can mean that after one spike the instances live like 15 minutes, then AWS Lambda shuts them down to only keep the 5 provisioned ones and if then another spike comes, you'll see the same problem as before.

Please note: This is a very simplified explanation of what's happening behind the scenes - and more a good guess based on your description. It would help if you'd provide some example numbers (e.g. init duration, execution duration, response time) and maybe some example code what you're doing in your Lambda function. Also: which runtime are you using? how does your traffic pattern look like?

Potential solutions

Reduce the cold start time of your Lambda functions -> always a good idea for Lambda functions that are behind an API Gateway
Provision more instances -> only possible up to a certain (soft) limit

The lambdas that have been assigned provisoned concurrency 5 aren't hit from anywhere rather than my local machine. So, provisioned concurrency 5 doesn't seem to be inadequate. — Avani Khabiya, Feb 15 '21 at 09:48
About reducing the cold starts, I have eliminated cold start duration from some of the Lambdas using cron expression that hits them every 5 min to keep them warm. For the Lambdas the trigger can not scheduled to call the Lambda (as authentication is required for this), I have used provisioning, but the impact is not visible. — Avani Khabiya, Feb 15 '21 at 10:11
Updated the question, added tracings of Lambda and API gateway. Please check. — Avani Khabiya, Feb 18 '21 at 13:17

Mithilesh · Answer 2 · 2022-05-15T09:47:35.937

===== Flow-of-Services =====

API GW ⮕ Lambda function(Provisioned)

===== Query =====

You want to understand why there is latency while processing the request from API GW to Lambda function.

===== Time-stamp-of-CW =====

021-02-15T11:51:36.621+05:30 2021-02-15T11:51:38.535+05:3 Lambda duration - Duration: 1331.38 ms Billed Duration: 1332 ms Memory Size: 256 MB Max Memory Used: 130 MB Init Duration: 1174.18 ms

===== Follow-up questions =====

While the request was processed through the API GW to Lambda function, the execution env took 1174.18 ms(1.1s) sections to become active & execute your code in remaining 0.3s which makes it total of 1.4 seconds.

Q. What is the type of processor you are using? Q. Type of API & Endpoint type?

===== K.C =====

You should read AWS DOC for optimizing your Lambda function code.Optimizing static initialization
Lambda won't charge you for the time it takes to initialize your code (e.g. importing stuff) as long as it finishes in about X seconds.1

===== Replication Observation =====

Without provisioned concurrency.

API GW execution time - 286 ms
Initialization -195ms
Invocation - 11ms
Overhead - 0ms

With provisioned concurrency.

API GW execution time - 1.103ms
Initialization - 97ms
Invocation - 1ms
Overhead - 0ms

I'm in US-WEST-2 region and calling the request from 12,575 km away from the Region. I hv a REST API which is configured as 'Regional' endpoint type. Lambda function is running on x86_64 – 64-bit x86 architecture, for x86-based processors.

-- Check if you have optimized Lambda function code. -- To have low latency, you may make use of 'edge' optimized Rest API.An edge-optimized API endpoint is best for geographically distributed clients. API requests are routed to the nearest CloudFront Point of Presence (POP). -- Alway choose the Region which is closest to the high traffic region.

-- References:

new-provisioned-concurrency-for-lambda-functions provisioned-concurrency.html#optimizing-latency

Provisioned concurrency has minor impact on response time of Lambda function

2 Answers2

Potential solutions

-- References: