
I have multiple Lambdas exposed through API Gateway using proxy integration. From time to time I get strange errors with status code 502. There is nothing in the Lambda CloudWatch logs. Below are the API Gateway logs for a sample request:

(0cbbd9f5-f1bd-11e7-92c0-4d5d3b7d0380) Received response. Integration latency: 231 ms

(0cbbd9f5-f1bd-11e7-92c0-4d5d3b7d0380) Endpoint response body before transformations:
{
    "Message": "An error occurred and the request cannot be processed.",
    "Type": "Service"
}

(0cbbd9f5-f1bd-11e7-92c0-4d5d3b7d0380) Endpoint response headers: 
{
    Connection=keep-alive, 
    x-amzn-RequestId=0cbc9dee-f1bd-11e7-857b-91f7f814692c, 
    x-amzn-ErrorType=ServiceException, 
    Content-Length=86, 
    Date=Fri, 05 Jan 2018 02:06:32 GMT, 
    Content-Type=application/json
}

(0cbbd9f5-f1bd-11e7-92c0-4d5d3b7d0380) Execution failed due to configuration error: Malformed Lambda proxy response

(0cbbd9f5-f1bd-11e7-92c0-4d5d3b7d0380) Method completed with status: 502

Basically, it seems that API Gateway cannot reach Lambda, and the call to Lambda returns:

(0cbbd9f5-f1bd-11e7-92c0-4d5d3b7d0380) Endpoint response body before transformations:
{
    "Message": "An error occurred and the request cannot be processed.",
    "Type": "Service"
}

Is anyone else experiencing these issues? The only fix available on my side is to write a retry mechanism, but it looks to me like either I am missing some configuration or it is an AWS failure that they should handle.
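A retry mechanism of the kind mentioned above could look roughly like the following. This is a minimal sketch, not a recommended production client: `call_with_retry`, its parameters, and the `send` callable (which is assumed to perform one HTTP request and return a status code and body) are hypothetical names introduced for illustration. It retries only on 5xx responses, with exponential backoff and jitter.

```python
import random
import time
from typing import Callable, Tuple


def call_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 4,
    base_delay: float = 0.2,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-5xx status or attempts run out.

    send is a hypothetical callable that performs one request and
    returns (status_code, body). The last response is returned as-is,
    even if it is still a 5xx.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    status, body = send()
    attempt = 1
    while status >= 500 and attempt < max_attempts:
        # Exponential backoff with full jitter: wait a random time
        # between 0 and base_delay * 2^(attempt - 1) seconds.
        sleep(random.uniform(0, base_delay * 2 ** (attempt - 1)))
        status, body = send()
        attempt += 1
    return status, body
```

Retrying like this is only safe when the request is idempotent, since a 502 does not by itself prove that nothing ran on the backend.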

Joshua
Pawel
  • Usually AWS API Gateway returns HTTP 502 (Bad Gateway) when an exception is not handled by the function (proxy mode). There's a message in the log, "Execution failed due to configuration error: Malformed Lambda proxy response", which means that for some reason your Lambda function didn't return the response in the expected format. Try logging the entire execution of your Lambda functions to find out what's wrong. – Tom Melo Jan 06 '18 at 23:35
  • @TomMelo Thanks for your response, Tom! As I wrote above, the call to Lambda returns: "Endpoint response body before transformations: { "Message": "An error occurred and the request cannot be processed.", "Type": "Service" }", which API Gateway later maps to "Execution failed due to configuration error: Malformed Lambda proxy response". I have the entire Lambda function surrounded by a try/catch block, so there is no way it comes from my code. What's more, CloudWatch is empty for that request (no start/finish logs as usual), so it doesn't even reach AWS Lambda. – Pawel Jan 08 '18 at 08:02
  • That response is from Lambda to API Gateway. The recommendation is to retry any 5xx errors from the client side to improve reliability. Your best bet to resolve this issue is to open a support ticket with AWS. – Abhigna Nagaraja Jan 10 '18 at 00:41
  • Already did that. No response so far: https://forums.aws.amazon.com/thread.jspa?messageID=719917 I don't have a commercial support plan, so the AWS forum is all I can try. – Pawel Jan 10 '18 at 06:28
  • I've seen the same behavior in a number of instances over the past 1 year + for our production apps. It's totally random. It seems like API Gateway didn't get any response from the Lambda (or not the response it was expecting), totally at random, and barfs. But after a matter of seconds to up to a minute, it will recover and pretend everything's fine. – Joshua May 10 '18 at 03:44
  • Does it occur for all the Lambdas or for any specific Lambda? – Mukund May 10 '18 at 09:14
  • Hope this helps: https://aws.amazon.com/premiumsupport/knowledge-center/malformed-502-api-gateway/ The Lambda is expected to respond in the following format: { "isBase64Encoded": true|false, "statusCode": httpStatusCode, "headers": { "headerName": "headerValue", ... }, "body": "..." } – Mukund May 10 '18 at 09:16
  • Is the Lambda part of a VPC? – raevilman May 10 '18 at 17:13
  • Yes, it's part of a VPC. – Pawel May 11 '18 at 11:10
  • If the Lambda is inside a VPC, then all the Lambdas will get launched in a private subnet. Are you accessing any other AWS services from the Lambda? If so, the Lambda will try to access them via the Internet. From a private subnet, it will not be able to reach most services (except those that have a VPC endpoint). To enable that, your Lambda role should have permissions to create/attach/delete ENIs, and the traffic should be routed out via a NAT gateway in your network. Otherwise, the Lambda will time out and API Gateway will report errors like these. Hope this helps. – Mukund May 11 '18 at 11:24
  • It's connecting to services in the same VPC, it has access, and it works most of the time. I'm just getting 502 occasionally. – Pawel May 11 '18 at 11:58
  • Another thing to check is that ALL Lambda responses must be valid HTTP responses; see https://stackoverflow.com/questions/43708017/aws-lambda-api-gateway-error-malformed-lambda-proxy-response/43718963#43718963. Do any of your code paths result in an invalid one? – Mrk Fldig May 14 '18 at 22:06
  • Nope, the Lambda is not executed at all. – Pawel May 15 '18 at 07:07
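For reference, the proxy response format quoted in the comments can be sketched as a small Python handler. This is only an illustration of the expected shape: `proxy_response`, `do_work`, and the JSON payloads are hypothetical, and per the asker's last comment it would not help when the function is never invoked. It only rules out a malformed response coming from the function's own code paths.

```python
import json
from typing import Any, Dict, Optional


def proxy_response(
    status_code: int,
    payload: Any,
    headers: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    """Build a dict in the Lambda proxy integration response shape."""
    return {
        "isBase64Encoded": False,
        "statusCode": status_code,
        "headers": {"Content-Type": "application/json", **(headers or {})},
        # API Gateway expects the body to be a string, not an object.
        "body": json.dumps(payload),
    }


def do_work(event: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical placeholder for the function's real logic."""
    return {"ok": True}


def handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    # Every code path, including failures, returns a well-formed response.
    try:
        return proxy_response(200, do_work(event))
    except Exception as exc:
        return proxy_response(500, {"error": str(exc)})
```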

1 Answer


I'm listing here one possible reason...

When an AWS Lambda is configured to run in a VPC, it takes one IP address per execution from the VPC.

And if the VPC doesn't have enough free IPs, your Lambda will fail silently :(

I've personally faced issues caused by limited IPs; increasing the number of available IPs solved the issue.

The text below is from this link:
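One way to check for this kind of IP shortage is to look at the `AvailableIpAddressCount` field that EC2's `describe_subnets` call returns for each subnet. The sketch below is an assumption-laden illustration: `low_ip_subnets` and the `min_free` threshold are hypothetical, and the threshold you need depends on how many concurrent executions you expect.

```python
from typing import Any, Dict, List


def low_ip_subnets(
    describe_subnets_response: Dict[str, Any],
    min_free: int = 20,
) -> List[str]:
    """Return IDs of subnets with fewer than min_free available addresses.

    The input is expected to have the shape of an EC2 describe_subnets
    response: {"Subnets": [{"SubnetId": ..., "AvailableIpAddressCount": ...}]}.
    min_free is an arbitrary illustrative threshold.
    """
    return [
        subnet["SubnetId"]
        for subnet in describe_subnets_response.get("Subnets", [])
        if subnet["AvailableIpAddressCount"] < min_free
    ]


# Possible usage with boto3 (requires AWS credentials; the subnet ID is a
# placeholder for the subnets configured on your Lambda function):
#
#   import boto3
#   ec2 = boto3.client("ec2")
#   resp = ec2.describe_subnets(SubnetIds=["subnet-..."])
#   print(low_ip_subnets(resp))
```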

The subnets you specify should have sufficient available IP addresses to match the number of ENIs.

We also recommend that you specify at least one subnet in each Availability Zone in your Lambda function configuration. By specifying subnets in each of the Availability Zones, your Lambda function can run in another Availability Zone if one goes down or runs out of IP addresses.

Note

If your VPC does not have sufficient ENIs or subnet IPs, your Lambda function will not scale as requests increase, and you will see an increase in function failures. AWS Lambda currently does not log errors to CloudWatch Logs that are caused by insufficient ENIs or IP addresses. If you see an increase in errors without corresponding CloudWatch Logs, you can invoke the Lambda function synchronously to get the error responses (for example, test your Lambda function in the AWS Lambda console because the console invokes your Lambda function synchronously and displays errors).
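Besides the console, a synchronous invocation can also be made from code. The following is a rough sketch assuming a boto3 Lambda client is passed in: `invoke_sync` and the shape of its return value are hypothetical, and which error code you actually see for an ENI or IP shortage is not something this sketch guarantees.

```python
import json
from typing import Any, Dict


def invoke_sync(
    lambda_client: Any,
    function_name: str,
    event: Dict[str, Any],
) -> Dict[str, Any]:
    """Invoke a function synchronously and surface any invocation error.

    lambda_client is assumed to behave like boto3.client("lambda").
    """
    try:
        resp = lambda_client.invoke(
            FunctionName=function_name,
            InvocationType="RequestResponse",  # synchronous invocation
            Payload=json.dumps(event).encode("utf-8"),
        )
    except Exception as exc:
        # With boto3 this is typically a botocore ClientError, which
        # carries the service error in exc.response["Error"].
        error = getattr(exc, "response", {}).get("Error", {})
        return {
            "ok": False,
            "error_code": error.get("Code", type(exc).__name__),
            "message": error.get("Message", str(exc)),
        }
    return {
        "ok": "FunctionError" not in resp,
        "status_code": resp["StatusCode"],
        "payload": resp["Payload"].read().decode("utf-8"),
    }


# Possible usage (function name is a placeholder):
#
#   import boto3
#   print(invoke_sync(boto3.client("lambda"), "my-function", {}))
```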

raevilman
  • UPDATE: It seems this will no longer be a problem, since Amazon recently announced really nice changes in the way Lambda uses elastic network interfaces: https://aws.amazon.com/es/blogs/compute/announcing-improved-vpc-networking-for-aws-lambda-functions/ – Jorge Valvert Nov 02 '19 at 05:04