I have created an AWS Lambda function for my Python module using the below Terraform code:

resource "aws_lambda_function" "api_lambda" {
  function_name = local.lambda_name
  timeout       = 300
  image_uri     = "${local.account_id}.dkr.ecr.eu-west-1.amazonaws.com/workload-dbt:latest"
  package_type  = "Image"
  architectures = ["x86_64"]
  memory_size   = 1024
  role          = aws_iam_role.api_lambda_role.arn

  vpc_config {
    security_group_ids = [aws_security_group.security_group_for_lambda.id]
    subnet_ids         = var.subnet_ids
  }

  environment {
    variables = {
      gitlab_username     = var.gitlab_username
      gitlab_access_token = var.gitlab_access_token
    }
  }
}

data "aws_vpc" "selected_vpc" {
  id = var.vpc_id
}


resource "aws_security_group" "security_group_for_lambda" {
  name        = "Security group for lambda"
  description = "Security group for lambda within the vpc"

  vpc_id = var.vpc_id

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [data.aws_vpc.selected_vpc.cidr_block]
  }

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = [data.aws_vpc.selected_vpc.cidr_block]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}


# Lambda Permissions
resource "aws_lambda_permission" "api_gateway_call_async_lambda_permission" {
  statement_id  = "AllowAPIGatewayInvokeLambda"
  action        = "lambda:InvokeFunction"
  principal     = "apigateway.amazonaws.com"
  function_name = aws_lambda_function.api_lambda.function_name
  source_arn    = "${aws_api_gateway_rest_api.rest_api.execution_arn}/*/*"
}

When tested via API Gateway I get the below error:

{"message": "Endpoint request timed out"} 

I also tried increasing the timeout and memory, as can be seen in the Terraform code. I have also checked that the function is attached to the correct VPC ID and subnets, and that the security group's outbound rule allows all traffic to 0.0.0.0/0.

What else am I missing here?

RushHour
  • Have you actually created the Lambda permission resource? What kind of integration have you defined for API Gateway? What HTTP method? – Marko E Jun 23 '23 at 10:49
  • Ok so I checked the logs and it gives me this error `Failed to connect to gitlab.com port 443 after 129557 ms: Couldn't connect to server` where I am trying to clone a gitlab repo inside `/tmp` directory via Python subprocess library. @MarkoE – RushHour Jun 23 '23 at 10:51
  • Well, that's your answer then. – Marko E Jun 23 '23 at 10:52
  • Yeah, but I can see the `username` and `password` being provided and even the access. But not sure – RushHour Jun 23 '23 at 11:00
  • Your Lambda function is in a VPC. If your GitLab instance is not in the same VPC, the route tables for your subnets need a route to an Internet Gateway/NAT Gateway so they can reach the Internet. – Marko E Jun 23 '23 at 11:01
  • Okay, I can check this. Thanks for suggesting @MarkoE. – RushHour Jun 23 '23 at 11:03
  • But as per the documentation, I can also have a security group for internet access and which I already have. Still, I need a route table? @MarkoE – RushHour Jun 23 '23 at 11:14
  • I have also checked my subnet id, where I can see 4 route tables attached to that ID where one destination is `0.0.0.0/0` and target is `IGW (Internet gateway ID)` – RushHour Jun 23 '23 at 11:18
  • Sure, and what about the security group? What traffic does it allow? – Marko E Jun 23 '23 at 11:23
  • Outbound rule is all for the security group id which is linked to my lambda. @MarkoE – RushHour Jun 23 '23 at 12:52
  • Ok, and what traffic are you allowing? Which ports? – Marko E Jun 23 '23 at 13:11
  • for inbound its 22 and 443 for SG. @MarkoE – RushHour Jun 23 '23 at 13:16
  • The fact that your Lambda function's subnet's default route is the IGW tells me that you have connected the Lambda function to a public subnet of your VPC. It should be a private subnet (see [here](https://stackoverflow.com/questions/52992085/why-cant-an-aws-lambda-function-inside-a-public-subnet-in-a-vpc-connect-to-the)). And you need NAT (or NAT GW), which I presume you already have. – jarmod Jun 23 '23 at 14:31
  • I didn't get your point exactly @jarmod. – RushHour Jun 26 '23 at 09:42
  • A couple of questions first to clarify the situation. Is your Lambda function actually being invoked (do you see CloudWatch Logs for the invocation)? Do those same logs show the Lambda function being timed out? – jarmod Jun 26 '23 at 12:06
  • Yes, I do see that. Also, I have updated the terraform code whatever is relevant w.r.t Lambda. @jarmod. – RushHour Jun 26 '23 at 16:16
  • As suggested earlier, assuming that you genuinely need your Lambda function to run in your VPC, move the Lambda function from public subnet to private subnet and ensure you have IGW in the VPC and NAT in a public subnet. If you don't need your Lambda function to run in your VPC then don't configure it for VPC. – jarmod Jun 26 '23 at 16:25
  • Okay, I will try to check for the private subnet. – RushHour Jun 26 '23 at 18:07
  • @jarmod I didn't get your last point "move the Lambda function from public subnet to private subnet and ensure you have IGW in the VPC and NAT in a public subnet." What I understood from the post you shared was creating a private subnet with NAT ID. I already have IGW in the public subnet. Can you please clear up my doubts? Apologies. I am entirely new to this networking but try my best to understand each and everything. – RushHour Jun 27 '23 at 17:59
  • Does your Lambda function actually need to run in your VPC? If you don't know then it probably doesn't. That will likely fix most of your issues. – jarmod Jun 27 '23 at 19:04
  • @jarmod It has to be deployed inside a VPC. That's the requirement. I was actually trying to confirm whether my understanding of the last comment is correct. If not, then what exactly do you mean by `move the Lambda function from public subnet to private subnet and ensure you have IGW in the VPC and NAT in a public subnet.`? – RushHour Jun 28 '23 at 07:35
  • When you configure a Lambda function, you indicate whether or not it should be connected to a VPC by supplying (or not) a subnet ID. That subnet ID should represent a private subnet, not a public subnet. If that Lambda function makes outbound network requests e.g. to other websites or APIs then it needs a network route to the internet - to do that, the default route in the private subnet should be a NAT (or NAT gateway) in the public subnet and your VPC needs an IGW (which you already have). See [here](https://docs.aws.amazon.com/lambda/latest/dg/configuration-vpc.html#vpc-internet). – jarmod Jun 28 '23 at 11:35
  • See [Three ways to use AWS services from a Lambda in a VPC](https://www.alexdebrie.com/posts/aws-lambda-vpc/). If I understand your scenario, where the Lambda function needs to connect to gitlab.com, you need the [NAT option](https://www.alexdebrie.com/posts/aws-lambda-vpc/#give-your-lambda-function-public-internet-access-with-a-nat-gateway). – jarmod Jun 28 '23 at 11:45
  • Cool. Thanks a lot for this explanation @jarmod. Now it's very much clear to me. I will try to resolve this and use a private subnet and let you know so that you can post this explanation as an answer. – RushHour Jun 28 '23 at 12:55
  • @jarmod I have used a private subnet with the `NAT` option as you suggested and it worked like a charm. Thanks a lot for bearing with me. So, can you post your comment as an answer so that I can accept it? – RushHour Jul 03 '23 at 06:59
  • Sorry, just caught up with this thread. Glad it's resolved. Will add an answer shortly. – jarmod Jul 11 '23 at 14:32

1 Answer

In your earlier comments, you mentioned:

"I have also checked my subnet id, where I can see 4 route tables attached to that ID where one destination is 0.0.0.0/0 and target is IGW (Internet gateway)"

The fact that your Lambda function's subnet's default route is the IGW tells me that you have connected the Lambda function to a public subnet of your VPC. Public subnets are, by definition, those subnets whose default route is an IGW.
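
One way to confirm this from Terraform is to look up the route table associated with the Lambda function's subnet and check where its default route points. This is only a sketch: the data source and output names are made up for illustration, it inspects just the first entry of var.subnet_ids, and the lookup by subnet_id assumes the subnet has an explicit route table association (a subnet that relies on the VPC's main route table may not be found this way).

```hcl
# Hypothetical names; looks at the first subnet passed to the Lambda function.
data "aws_route_table" "lambda_subnet" {
  subnet_id = var.subnet_ids[0]
}

# True when the subnet's 0.0.0.0/0 route targets an Internet Gateway,
# i.e. the subnet is public and unsuitable for a Lambda that needs internet access.
output "lambda_subnet_is_public" {
  value = anytrue([
    for r in data.aws_route_table.lambda_subnet.routes :
    r.cidr_block == "0.0.0.0/0" && can(regex("^igw-", r.gateway_id))
  ])
}
```

If this output is true, the function is attached to a public subnet.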

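It should, however, be connected to a private subnet. A Lambda function in a VPC only gets network interfaces with private IP addresses, and an IGW only carries traffic for resources that have a public IP, so the function cannot reach the internet from a public subnet. See "Why can't an AWS Lambda function in a public subnet connect to the internet?" (https://stackoverflow.com/questions/52992085/why-cant-an-aws-lambda-function-inside-a-public-subnet-in-a-vpc-connect-to-the) for details. You will also need NAT (or NAT Gateway), which I presume you already have or can easily add.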

So, basically, you should connect the Lambda to a private subnet and include a NAT or NAT Gateway in your infrastructure.

Also note that best practices recommend that you configure at least 2 private subnets, each in a different Availability Zone (AZ), and configure the Lambda function to connect to 2+ private subnets. That will give you some resilience to AZ failures.
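
As a rough illustration of the multi-AZ layout, the sketch below creates two private subnets in different AZs and associates both with a route table that routes to NAT. The variables private_cidrs and private_route_table_id and the resource names are hypothetical; var.vpc_id and the security group reference come from the question.

```hcl
# Hypothetical inputs.
variable "private_cidrs" {
  type = list(string) # two unused CIDR ranges inside the VPC
}

variable "private_route_table_id" {
  type = string # a route table whose default route targets a NAT gateway
}

data "aws_availability_zones" "available" {
  state = "available"
}

# One private subnet in each of the first two available AZs.
resource "aws_subnet" "lambda_private_az" {
  count             = 2
  vpc_id            = var.vpc_id
  cidr_block        = var.private_cidrs[count.index]
  availability_zone = data.aws_availability_zones.available.names[count.index]
}

resource "aws_route_table_association" "lambda_private_az" {
  count          = 2
  subnet_id      = aws_subnet.lambda_private_az[count.index].id
  route_table_id = var.private_route_table_id
}

# In aws_lambda_function.api_lambda, the vpc_config block would then become:
#   vpc_config {
#     security_group_ids = [aws_security_group.security_group_for_lambda.id]
#     subnet_ids         = aws_subnet.lambda_private_az[*].id
#   }
```

Sharing one route table (and so one NAT gateway) across both subnets keeps the sketch short; a NAT gateway per AZ would avoid depending on a single AZ for outbound traffic, at extra cost.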

jarmod