I have an AWS Lambda function in a VPC in AWS account A. That VPC has a peering connection with a VPC in AWS account B, which contains a DAX cluster. I get the following error when I try to connect to the DAX cluster from the Lambda function.

2021-12-17T17:29:34.096Z    279f4ed8-a6ea-4f50-b1d7-31c307cc3f30    ERROR   Failed to pull from my-cluster.v3fh7d.dax-clusters.us-east-1.amazonaws.com (11.0.225.143): TimeoutError: ConnectionException: Connection timeout after 10000ms
    at SocketTubePool.alloc (/var/task/node_modules/amazon-dax-client/src/Tube.js:244:64)
    at /var/task/node_modules/amazon-dax-client/generated-src/Operations.js:215:30 {
  time: 1639762164096,
  code: 'ConnectionException',
  retryable: true,
  requestId: null,
  statusCode: -1,
  _tubeInvalid: false,
  waitForRecoveryBeforeRetrying: false
}

The relevant part of my Lambda code is below.

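const AWS = require("aws-sdk");
const AmazonDaxClient = require("amazon-dax-client");

let assumedRole;

// Assume the cross-account role in account B that grants access to DAX.
const sts = new AWS.STS({ region: "us-east-1" });
const params = {
  RoleArn:
    "arn:aws:iam::<account-b>:role/role-to-access-dax",
  RoleSessionName: "testAssumeRoleSession" + Date.now().toString(),
  DurationSeconds: 3600,
};

try {
  assumedRole = await sts.assumeRole(params).promise();
} catch (error) {
  console.log("Failed getting sts assume role: " + error);
  // Without credentials the DAX client below cannot be built, so stop here.
  return createResponse(500, "Failed to assume role");
}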

const dax = new AmazonDaxClient({
  endpoint:
    "dax://my-cluster.v3fh7d.dax-clusters.us-east-1.amazonaws.com",
  region: "us-east-1",
  accessKeyId: assumedRole.Credentials.AccessKeyId,
  secretAccessKey: assumedRole.Credentials.SecretAccessKey,
  sessionToken: assumedRole.Credentials.SessionToken,
  httpOptions: { timeout: 150000 },
  maxRetries: 1,
});

const dynamodb = new AWS.DynamoDB.DocumentClient({ service: dax });

try {
  const params = {
    Key: {
      userid: requestData.userid,
    },
    TableName: "my-users-table",
  };
  const result = await dynamodb.get(params).promise();

  // Loose equality with null matches both null and undefined.
  if (result.Item == null) {
    return createResponse(401, "Unauthorized");
  }
  return createResponse(200, JSON.stringify(result.Item));
} catch (error) {
  return createResponse(500, error);
}

The role arn:aws:iam::<account-b>:role/role-to-access-dax has the following permissions:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "dax:GetItem",
                "dax:BatchGetItem",
                "dax:Query",
                "dax:Scan",
                "dax:PutItem",
                "dax:UpdateItem",
                "dax:DeleteItem",
                "dax:BatchWriteItem",
                "dax:ConditionCheckItem"
            ],
            "Resource": "arn:aws:dax:us-east-1:<account-b>:cache/my-cluster"
        }
    ]
}

and the following trust relationship:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<account-a>:root"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

The DAX cluster's IAM role has the AmazonDynamoDBFullAccess policy attached.

The peering connection shows up as Active in the AWS console.

The DAX cluster's security group has an inbound rule to allow TCP traffic on port 8111 from source <account-a> / <sg-of-lambda>.
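One way to separate a network problem from a DAX client problem is to test plain TCP reachability on port 8111 from inside the Lambda function. The following is a minimal sketch using Node's built-in `net` module; `probeTcp` is a hypothetical helper name, not part of any AWS SDK, and the 3000 ms default timeout is an arbitrary assumption.

```javascript
const net = require("net");

// Hypothetical helper: attempts a TCP connection and resolves to
// { ok, detail } instead of throwing, so it is easy to log from a handler.
function probeTcp(host, port, timeoutMs = 3000) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    let settled = false;

    // Resolve exactly once and always release the socket.
    const finish = (ok, detail) => {
      if (settled) return;
      settled = true;
      socket.destroy();
      resolve({ ok, detail });
    };

    socket.setTimeout(timeoutMs);
    socket.once("connect", () => finish(true, "connected"));
    socket.once("timeout", () => finish(false, "timeout"));
    socket.once("error", (err) => finish(false, err.code || err.message));
  });
}
```

Inside the handler this could be called as `console.log(await probeTcp("my-cluster.v3fh7d.dax-clusters.us-east-1.amazonaws.com", 8111));`. A `timeout` result would suggest that packets are being dropped somewhere (routing, security groups, or network ACLs), whereas `connected` would point away from the network path.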

The CIDR of the Account A VPC is 10.0.0.0/24 and the CIDR of the Account B VPC is 11.0.0.0/16.

The Account A VPC's main route table has a route directing traffic with destination 11.0.0.0/16 to the peering connection. Likewise, the Account B VPC's main route table has a route directing traffic with destination 10.0.0.0/24 to the peering connection.
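As a sanity check on the addressing, here is a small sketch that confirms the address from the error message (11.0.225.143) lies inside the Account B CIDR and that the two VPC CIDRs do not overlap, which VPC peering requires. The helper names (`ipToInt`, `parseCidr`, `cidrContains`, `cidrsOverlap`) are hypothetical, and the code handles IPv4 only.

```javascript
// Convert a dotted-quad IPv4 address to an unsigned 32-bit integer.
function ipToInt(ip) {
  return ip
    .split(".")
    .reduce((acc, octet) => ((acc << 8) | Number(octet)) >>> 0, 0);
}

// Split "a.b.c.d/n" into its network address, mask, and prefix length.
function parseCidr(cidr) {
  const [base, bitsText] = cidr.split("/");
  const bits = Number(bitsText);
  const mask = bits === 0 ? 0 : (0xffffffff << (32 - bits)) >>> 0;
  return { network: (ipToInt(base) & mask) >>> 0, mask, bits };
}

// True if the address falls inside the CIDR block.
function cidrContains(cidr, ip) {
  const { network, mask } = parseCidr(cidr);
  return ((ipToInt(ip) & mask) >>> 0) === network;
}

// Two blocks overlap if they agree on the bits of the shorter prefix.
function cidrsOverlap(a, b) {
  const ca = parseCidr(a);
  const cb = parseCidr(b);
  const mask = ca.bits <= cb.bits ? ca.mask : cb.mask;
  return ((ca.network & mask) >>> 0) === ((cb.network & mask) >>> 0);
}

console.log(cidrContains("11.0.0.0/16", "11.0.225.143")); // true
console.log(cidrsOverlap("10.0.0.0/24", "11.0.0.0/16")); // false
```

Both checks pass for the values in this question, so the CIDR layout itself does not appear to be the problem.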

As an aside, the following options in the Lambda code appear to be ignored: the DAX request is retried several times, and the timeout stays at 10000 ms.

  httpOptions: { timeout: 150000 },
  maxRetries: 1,
harindoo
  • Did you check the connection from an instance, or in any other way, to confirm that the issue is only with the Lambda? – Marcin Dec 17 '21 at 22:53
  • Thanks for the suggestion! I haven't tried that yet. Will do and get back to you – harindoo Dec 20 '21 at 17:19

1 Answer

I solved this issue with the help of an AWS representative. It turns out that the VPC containing the Lambda function needed both a public and a private subnet. The Lambda function itself has to run in the private subnet, while the public subnet hosts a NAT gateway and routes to an internet gateway. Instead of a single route table for the VPC, I needed a separate route table for each subnet. The private subnet's route table contains the peering connection route and the VPC CIDR route mentioned in my question, plus a route with destination 0.0.0.0/0 and the NAT gateway as the target. The public subnet's route table contains the VPC CIDR route and a route with destination 0.0.0.0/0 and the internet gateway as the target.
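To illustrate how the two route tables direct traffic, here is a rough sketch that models them as plain arrays and picks the most specific matching route (longest prefix match), which is how VPC route tables choose among overlapping destinations. The target names (`pcx-example`, `nat-example`, `igw-example`) and the function names are placeholders, and the CIDRs are the ones from the question; real tables would use actual resource IDs.

```javascript
// Convert a dotted-quad IPv4 address to an unsigned 32-bit integer.
function ipToInt(ip) {
  return ip
    .split(".")
    .reduce((acc, octet) => ((acc << 8) | Number(octet)) >>> 0, 0);
}

// True if the address falls inside the CIDR block.
function matches(cidr, ip) {
  const [base, bitsText] = cidr.split("/");
  const bits = Number(bitsText);
  const mask = bits === 0 ? 0 : (0xffffffff << (32 - bits)) >>> 0;
  return ((ipToInt(ip) & mask) >>> 0) === ((ipToInt(base) & mask) >>> 0);
}

// Return the target of the most specific route that matches the address.
function resolveRoute(routeTable, ip) {
  let best = null;
  for (const route of routeTable) {
    const bits = Number(route.destination.split("/")[1]);
    if (matches(route.destination, ip) && (best === null || bits > best.bits)) {
      best = { target: route.target, bits };
    }
  }
  return best ? best.target : null;
}

// Private subnet: local traffic, peering to account B, everything else via NAT.
const privateRoutes = [
  { destination: "10.0.0.0/24", target: "local" },
  { destination: "11.0.0.0/16", target: "pcx-example" },
  { destination: "0.0.0.0/0", target: "nat-example" },
];

// Public subnet: local traffic, everything else via the internet gateway.
const publicRoutes = [
  { destination: "10.0.0.0/24", target: "local" },
  { destination: "0.0.0.0/0", target: "igw-example" },
];

console.log(resolveRoute(privateRoutes, "11.0.225.143")); // pcx-example
console.log(resolveRoute(privateRoutes, "203.0.113.10")); // nat-example
console.log(resolveRoute(publicRoutes, "203.0.113.10")); // igw-example
```

In this model, traffic from the Lambda function to the DAX node still goes over the peering connection, while any other outbound traffic from the private subnet leaves through the NAT gateway.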

harindoo
  • To simplify this answer: creating a new VPC with the "VPC and more" option sets up everything mentioned above (NAT gateway, internet gateway, private and public subnets) by default. After that, you need to 1) add an inbound rule to the security group for TCP port 8111 and/or 9111, and 2) update your Lambda function to use the new VPC and run only in its private subnets. – Tyr1on Jun 03 '22 at 15:46