
I am trying to use boto3 in Python 3.6 to connect to my Redshift cluster using the get_cluster_credentials API. The following code always times out when the Lambda function is added to the VPC. It runs without issue when the Lambda function is not added to the VPC.

I can't figure out if get_cluster_credentials uses the public or private IP to access Redshift. I also can't figure out if there is a way to force it to use one or the other.

import json
import boto3

def lambda_handler(event, context):
    redshiftClient = boto3.client('redshift', region_name='us-east-1')
    cluster_creds = redshiftClient.get_cluster_credentials( DbUser='awsuser',
                                                            DbName='dev',
                                                            ClusterIdentifier='redshift-cluster-1',
                                                            AutoCreate=False)
    print(cluster_creds)

    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }

My configuration is very simple. The NACL allows all traffic (0.0.0.0/0) on all ports and protocols. My security group does the same.

I have 1 internet gateway defined: igw-0d1e6dcbfdea792b2

I have 1 subnet and 1 routing table in the VPC. The routing table has one rule to map 0.0.0.0/0 --> igw-0d1e6dcbfdea792b2.

I am able to connect from outside AWS to the cluster using SQL Workbench/J without issue.

I have looked at many posts, threads and documents, but cannot figure out what is happening:

AWS Lambda times out connecting to RedShift

Connect Lambda to Redshift in Different Availability Zones

https://github.com/awslabs/aws-lambda-redshift-loader/issues/86

Accessing Redshift from Lambda - Avoiding the 0.0.0.0/0 Security Group

https://aws.amazon.com/blogs/big-data/a-zero-administration-amazon-redshift-database-loader/

Conecting AWS Lambda to Redshift - Times out after 60 seconds

Please help.

Thanks a lot.

Garet Jax

2 Answers


As per your other question, when an AWS Lambda function is added to a VPC, it does not receive a Public IP address. Therefore, if the function wishes to access the Internet (in this case to make the get_cluster_credentials() call), you should:

  • Add a NAT Gateway in a Public subnet
  • Attach the Lambda function to a Private subnet
  • Set routing on the private subnet to use the NAT Gateway for 0.0.0.0/0

It will not work if you have only one subnet, since the Lambda function will not be able to access the NAT Gateway.
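The three steps above can also be scripted. The following is a minimal sketch, not a definitive setup: the function name and its parameters are hypothetical, and it assumes the public and private subnets already exist and that the private subnet has no explicit route table association yet. It takes a boto3 EC2 client as its first argument.

```python
def route_private_subnet_via_nat(ec2, vpc_id, public_subnet_id, private_subnet_id):
    """Create a NAT Gateway in the public subnet and route the private subnet through it.

    `ec2` is expected to be a boto3 EC2 client,
    e.g. boto3.client("ec2", region_name="us-east-1").
    The function name and arguments are illustrative, not part of any AWS API.
    """
    # A NAT Gateway requires an Elastic IP address.
    eip = ec2.allocate_address(Domain="vpc")

    # The NAT Gateway itself lives in the PUBLIC subnet (the one routed to the IGW).
    nat = ec2.create_nat_gateway(
        SubnetId=public_subnet_id,
        AllocationId=eip["AllocationId"],
    )
    nat_id = nat["NatGateway"]["NatGatewayId"]

    # Wait until the gateway is usable before pointing a route at it.
    ec2.get_waiter("nat_gateway_available").wait(NatGatewayIds=[nat_id])

    # The private subnet needs its own route table: 0.0.0.0/0 --> NAT Gateway.
    route_table = ec2.create_route_table(VpcId=vpc_id)
    route_table_id = route_table["RouteTable"]["RouteTableId"]
    ec2.create_route(
        RouteTableId=route_table_id,
        DestinationCidrBlock="0.0.0.0/0",
        NatGatewayId=nat_id,
    )
    ec2.associate_route_table(RouteTableId=route_table_id, SubnetId=private_subnet_id)

    return nat_id, route_table_id
```

The Lambda function would then be attached to the private subnet only. Note that a NAT Gateway is billed while it exists, so delete it if you are only experimenting.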

I have also had success manually assigning an Elastic IP address to the Lambda function's ENI (instead of using a NAT Gateway), but this will not scale because Lambda might deploy additional containers and therefore additional ENIs. It might be sufficient if the function runs rarely and never concurrently.

John Rotenstein
  • Thanks for continuing to try and help me. I am still missing something. How do I add a NAT Gateway in a Public subnet? Since it's public, it already routes 0.0.0.0/0 --> Internet Gateway. How can I also route 0.0.0.0/0 --> NAT Gateway? – Garet Jax Feb 07 '19 at 04:53
  • Apologies if the terminology is unclear. You will need to **create** a NAT Gateway, which requires a Subnet and an Elastic IP address. Specify a Public subnet and it will be created in the public subnet, which allows it to talk to the Internet Gateway. Configure the private subnet's route table to send `0.0.0.0/0` requests to the NAT Gateway. Then, resources like the Lambda function that are in the private subnet will have their requests redirected to the NAT Gateway, which will send the requests to the Internet via the Internet Gateway. – John Rotenstein Feb 07 '19 at 05:02
  • May have answered my own question. I created a new route table with one route (0.0.0.0/0 --> NAT Gateway). I created two new subnets both using the new route table. I enabled "Auto assign public IPv4 address" on one of the new subnets. I assigned the other new subnet to the Lambda function. Is this the correct way? – Garet Jax Feb 07 '19 at 05:02
  • No. You need a public subnet (0.0.0.0/0 --> IGW) and a private subnet (0.0.0.0/0 --> NAT Gateway). The fact that 0.0.0.0/0 --> IGW makes it a "public" subnet. The other doesn't have it, so it is a "private" subnet. They need separate route tables. – John Rotenstein Feb 07 '19 at 05:03
  • There's a helpful how-to on the AWS documentation page on how to do all of the above https://aws.amazon.com/premiumsupport/knowledge-center/internet-access-lambda-function/ – Matti Lyra Nov 24 '21 at 10:08
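The public/private distinction drawn in the comments above can be checked mechanically. This is a small sketch with a hypothetical helper name; it assumes its input is the `Routes` list of one entry returned by the EC2 `describe_route_tables` call, and it only looks at the default IPv4 route.

```python
def classify_route_table(routes):
    """Classify a route table by where its 0.0.0.0/0 route points.

    `routes` is assumed to be the "Routes" list of one route table as
    returned by the EC2 describe_route_tables call.
    """
    for route in routes:
        if route.get("DestinationCidrBlock") != "0.0.0.0/0":
            continue  # only the default route decides public vs. private
        if route.get("GatewayId", "").startswith("igw-"):
            return "public"  # default route goes to an Internet Gateway
        if route.get("NatGatewayId"):
            return "private"  # default route goes to a NAT Gateway
    return "isolated"  # no default route to either kind of gateway


# The question's single route table: 0.0.0.0/0 --> Internet Gateway.
question_routes = [
    {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},  # CIDR is illustrative
    {"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-0d1e6dcbfdea792b2"},
]
print(classify_route_table(question_routes))
```

A Lambda function placed in a subnet that this helper reports as "public" still gets no public IP address, which is why it cannot reach the Internet from there.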

You should be able to reach the Redshift API (which is what get_cluster_credentials calls) directly from the VPC without an Internet or NAT gateway. This is what AWS PrivateLink is for, and Redshift is supported.

A generic description of the process (service specific variations apply):

  • Go to VPC -> Endpoints in AWS console
  • Create a new endpoint
  • Select which service you want to create the endpoint for
    • configure endpoint security group etc.
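The console steps above have a programmatic equivalent. The sketch below is untested against Redshift and rests on assumptions: the function name and arguments are hypothetical, the service name is assumed to follow the usual `com.amazonaws.<region>.redshift` pattern, and private DNS is assumed to be usable (it requires DNS support and DNS hostnames to be enabled on the VPC).

```python
def create_redshift_api_endpoint(ec2, vpc_id, subnet_ids, security_group_ids,
                                 region="us-east-1"):
    """Create an interface VPC endpoint for the Redshift API and return its ID.

    `ec2` is expected to be a boto3 EC2 client. The function name and the
    assumed service name pattern are illustrative, not confirmed by this thread.
    """
    response = ec2.create_vpc_endpoint(
        VpcEndpointType="Interface",
        VpcId=vpc_id,
        # Assumption: Redshift API service name for the given region.
        ServiceName="com.amazonaws.{}.redshift".format(region),
        SubnetIds=list(subnet_ids),
        # The endpoint's security group must allow HTTPS (443) from the Lambda function.
        SecurityGroupIds=list(security_group_ids),
        # With private DNS, the default Redshift API hostname resolves inside the VPC.
        PrivateDnsEnabled=True,
    )
    return response["VpcEndpoint"]["VpcEndpointId"]
```

It could be called as `create_redshift_api_endpoint(boto3.client("ec2", region_name="us-east-1"), vpc_id, [subnet_id], [sg_id])`, where the IDs are your own.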

Now, in your code when you create the client, you need to define the region and the endpoint for the client.
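As a rough illustration of that last step, the sketch below builds the client arguments with an optional `endpoint_url`. The helper names and the example DNS name are hypothetical. If the endpoint has private DNS enabled, the default hostname should already resolve to the endpoint and the `endpoint_url` override can likely be left out. The short timeouts are an addition of this sketch so that a networking problem fails quickly instead of running into the Lambda timeout.

```python
def redshift_client_kwargs(region_name, endpoint_dns_name=None):
    """Build keyword arguments for boto3.client("redshift", ...)."""
    kwargs = {"region_name": region_name}
    if endpoint_dns_name:
        # Point the client at the VPC endpoint instead of the public API hostname.
        kwargs["endpoint_url"] = "https://" + endpoint_dns_name
    return kwargs


def make_redshift_client(region_name, endpoint_dns_name=None):
    """Create a Redshift client that fails fast on connectivity problems."""
    # Imported here so the helper above stays usable without boto3 installed.
    import boto3
    from botocore.config import Config

    fail_fast = Config(connect_timeout=5, read_timeout=10, retries={"max_attempts": 1})
    return boto3.client(
        "redshift",
        config=fail_fast,
        **redshift_client_kwargs(region_name, endpoint_dns_name)
    )


# Hypothetical endpoint DNS name; use the one shown for your own endpoint.
example_kwargs = redshift_client_kwargs(
    "us-east-1",
    "vpce-0123456789abcdef0-example.redshift.us-east-1.vpce.amazonaws.com",
)
print(example_kwargs)
```

Inside the Lambda handler, `make_redshift_client("us-east-1")` would replace the plain `boto3.client('redshift', region_name='us-east-1')` call from the question.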

Disclaimer: I've not done this for Redshift, but I have done it for STS and it works.

Matti Lyra