I am bringing up a preprod environment with CloudFormation. I already have a staging environment that works properly. I have an AWS Lambda function that runs inside a VPC and, at some point, sends a message to an SQS queue.

Currently this Lambda function is timing out in preprod, while in staging it works correctly. If I configure the wrong SQS queue in staging, I get an access denied exception (SQS access is configured in the Lambda role). If I do the same with my preprod Lambda, it just times out.

I have reviewed the following questions:

Most of them contain answers pointing out the known issue that Lambdas within a VPC have no internet access. I have a NAT gateway properly configured. I even added some debug code to both the staging and preprod Lambdas to make an HTTP request to https://httpbin.org/get, and both succeeded, so internet access does not seem to be the problem.

I am not using VPC endpoints for SQS, just the public SQS URL. The last thing I tried was the AWS Reachability Analyzer, but I have not yet figured out how to create a path that goes from my VPC to the public SQS URL.
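A debug check of this kind can be narrowed down to a plain TCP connect with a short timeout, so that a blocked path shows up as "timeout" within seconds instead of consuming the whole Lambda timeout. The following is only a minimal sketch in Python (the thread's functions are .NET). The `probe` helper and its `connect` parameter are hypothetical names introduced here, and the SQS hostname in the usage note is an assumption based on the us-east-1 region mentioned in the comments.

```python
import socket
from typing import Callable


def probe(host: str, port: int = 443, timeout: float = 3.0,
          connect: Callable = socket.create_connection) -> str:
    """Try one TCP connection and report 'open', 'timeout', or 'error: ...'.

    `connect` defaults to socket.create_connection; it is injectable only so
    the function can be exercised without real network access.
    """
    try:
        conn = connect((host, port), timeout=timeout)
    except socket.timeout:
        # No answer at all: typical of a security group or route silently
        # dropping the packets.
        return "timeout"
    except OSError as exc:
        # Refused connections, DNS failures, unreachable networks, etc.
        return f"error: {exc}"
    conn.close()
    return "open"
```

Calling `probe("httpbin.org")` and `probe("sqs.us-east-1.amazonaws.com")` from the same function would show whether only the SQS path is affected. A "timeout" for SQS next to "open" for httpbin.org would suggest that the two destinations are not taking the same network path.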

Any advice will be appreciated.

taquion
  • Lambda function needs to be attached to private subnets, and only private subnets. – jarmod Jul 25 '22 at 15:30
  • Yes. The function is already attached to private subnets – taquion Jul 25 '22 at 15:32
  • Intermittent connectivity is most commonly caused by a Lambda function configured in multiple AZs (high availability best practice) but accidentally configured for both private and public subnets, so it works sometimes and not other times, depending on which AZ Lambda service happens to launch the function in. But, if you've double checked the subnet config then it's not that. – jarmod Jul 25 '22 at 15:36
  • Currently it is always failing (timing out). The lambda is deployed in US East and I have two subnets configured: one for us-east-1a and another for us-east-1a – taquion Jul 25 '22 at 15:41
  • But anyway, @jarmod, if this were related to a wrong subnet configuration, how come the Lambda is still able to hit https://httpbin.org/get? This is what bugs me the most... – taquion Jul 25 '22 at 15:43
  • While this seems unlikely, can you verify DNS resolution for the SQS endpoint? And double-check that you're actually talking to the SQS public endpoint and have not accidentally configured your SDK for a VPC endpoint to SQS. – jarmod Jul 25 '22 at 15:49
  • Thanks for the suggestion. I have configured both Lambdas (staging and preprod) to try to reach my preprod SQS queue. The staging Lambda throws an access denied exception (as expected, since its role lacks the permissions), whereas the preprod Lambda just times out. The code is the same in both, so it is not an SDK configuration issue (I am using .NET). I do not think it is a DNS resolution problem, since staging is actually reaching the queue. I appreciate your help, however. – taquion Jul 25 '22 at 15:55
  • Have you tried increasing the Lambda timeout? – ZabielskiGabriel Jul 27 '22 at 08:11
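The DNS check suggested in the comments can be run from inside each function. The following is a minimal Python sketch rather than the .NET code used in the thread, and `resolve` and `classify` are hypothetical helper names. The idea is that a public SQS hostname resolving only to private addresses would indicate that something inside the VPC, such as an interface endpoint with private DNS enabled, is answering for it.

```python
import ipaddress
import socket
from typing import Iterable, List


def resolve(host: str, port: int = 443) -> List[str]:
    """Return the distinct IP addresses that `host` resolves to."""
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string.
    return sorted({info[4][0] for info in infos})


def classify(addresses: Iterable[str]) -> str:
    """Label a set of addresses as 'private', 'public', 'mixed' or 'unresolved'."""
    addrs = [ipaddress.ip_address(a) for a in addresses]
    if not addrs:
        return "unresolved"
    # is_private is true for RFC 1918 ranges and other non-global ranges.
    private = [a.is_private for a in addrs]
    if all(private):
        return "private"
    if any(private):
        return "mixed"
    return "public"
```

Logging `classify(resolve(host))` for the SQS hostname in both environments, assuming the hostname the SDK actually uses is known, would show whether staging and preprod see the same kind of address. Since the two functions run in different VPCs, the same hostname can legitimately resolve differently in each.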

1 Answer

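Check whether you have a VPC endpoint configured for SQS: https://<region>.console.aws.amazon.com/vpc/home?region=<region>#Endpoints:

If you do, check the endpoint's security group. With private DNS enabled on an interface endpoint, the public SQS hostname resolves to the endpoint's private IPs, so the traffic bypasses the NAT gateway, and a security group that does not allow inbound HTTPS from the Lambda makes the request time out rather than fail with an error.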
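The same check can be scripted instead of done in the console. The sketch below assumes Python with boto3 installed and credentials that allow `ec2:DescribeVpcEndpoints`. The helper names `sqs_endpoints` and `fetch_endpoints` are hypothetical, and the region is whatever your stack uses.

```python
from typing import Any, Dict, List


def sqs_endpoints(response: Dict[str, Any], region: str) -> List[Dict[str, Any]]:
    """Pick the SQS endpoints out of a describe_vpc_endpoints response."""
    service = f"com.amazonaws.{region}.sqs"
    found = []
    for ep in response.get("VpcEndpoints", []):
        if ep.get("ServiceName") != service:
            continue
        found.append({
            "id": ep.get("VpcEndpointId"),
            "vpc": ep.get("VpcId"),
            "private_dns": ep.get("PrivateDnsEnabled", False),
            # These are the security groups whose inbound rules must allow
            # HTTPS (TCP 443) from the Lambda function's security group.
            "security_groups": [g["GroupId"] for g in ep.get("Groups", [])],
        })
    return found


def fetch_endpoints(region: str) -> Dict[str, Any]:
    """Call EC2 for the raw endpoint list (first page only in this sketch)."""
    import boto3  # third-party; imported here so the filter above needs no AWS access

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.describe_vpc_endpoints()
```

For example, `sqs_endpoints(fetch_endpoints("us-east-1"), "us-east-1")` would list any SQS endpoints in that region together with their VPC and security group IDs. An entry in the preprod VPC with `private_dns` set to true would be the first thing to inspect.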

Lasek