What I'm doing
I am trying to do this:
Launch tasks in a private subnet and make sure you have AWS PrivateLink endpoints configured in your VPC, for the services you need (ECR for image pull authentication, S3 for image layers, and AWS Secrets Manager for secrets).
My understanding is that AWS services act as a "VPC Endpoint Service", and all I need to do is set up an "Interface VPC endpoint" to make my service a "service consumer", as described here: https://docs.aws.amazon.com/vpc/latest/privatelink/vpce-interface.html
I have tried to implement this in CloudFormation, but I have a few questions from reading the documentation.
My Questions
Question 1
The documentation explains how to create the Interface VPC Endpoints, which is great. But it also says: "To turn on private DNS for the interface endpoint, for Enable DNS Name, select the check box." and "This option is turned on by default"
But over here: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-ec2-vpcendpoint.html#cfn-ec2-vpcendpoint-privatednsenabled
It says: "Default: false". Which is it?
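Whichever default applies, setting the property explicitly would sidestep the discrepancy. As far as I understand, private DNS on an interface endpoint also requires both DNS attributes to be enabled on the VPC. Here is a minimal sketch of that combination; the CidrBlock value is a placeholder and exampleEndpoint is a hypothetical resource name, neither taken from my real template:

```yaml
# Fragment of the Resources section (sketch, not my full template)
VPC:
  Type: AWS::EC2::VPC
  Properties:
    CidrBlock: 10.0.0.0/16       # placeholder range (assumption)
    EnableDnsSupport: true       # both DNS attributes are needed for private DNS
    EnableDnsHostnames: true

exampleEndpoint:                 # hypothetical name
  Type: AWS::EC2::VPCEndpoint
  Properties:
    ServiceName: !Sub com.amazonaws.${AWS::Region}.ecr.api
    VpcEndpointType: Interface
    PrivateDnsEnabled: true      # set explicitly instead of relying on a default
    VpcId: !Ref VPC
    SubnetIds:
      - !Ref privateSubnet1
      - !Ref privateSubnet2
    SecurityGroupIds:
      - !Ref ECSSecurityGroupDownloadRedisContainer
```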
Question 2
I need to enable 3 ServiceNames. So... do I need to repeat this 3 times? My YAML which repeats the AWS::EC2::VPCEndpoint 3 times is below. Is this really correct? It seems too long / verbose.
privateVPCEndpoint1:
  Type: AWS::EC2::VPCEndpoint
  Properties:
    ServiceName: !Sub com.amazonaws.${AWS::Region}.ecr.dkr
    PrivateDnsEnabled: True
    # "If this parameter is not specified, we attach a default policy that allows full access to the service."
    # PolicyDocument:
    SecurityGroupIds:
      - !Ref ECSSecurityGroupDownloadRedisContainer
    SubnetIds:
      - !Ref privateSubnet1
      - !Ref privateSubnet2
    VpcEndpointType: Interface
    VpcId: !Ref VPC
privateVPCEndpoint2:
  Type: AWS::EC2::VPCEndpoint
  Properties:
    ServiceName: !Sub com.amazonaws.${AWS::Region}.ecr.api
    PrivateDnsEnabled: True
    SecurityGroupIds:
      - !Ref ECSSecurityGroupDownloadRedisContainer
    SubnetIds:
      - !Ref privateSubnet1
      - !Ref privateSubnet2
    VpcEndpointType: Interface
    VpcId: !Ref VPC
privateVPCEndpoint3:
  Type: AWS::EC2::VPCEndpoint
  Properties:
    ServiceName: !Sub com.amazonaws.${AWS::Region}.ecr.s3
    PrivateDnsEnabled: True
    SecurityGroupIds:
      - !Ref ECSSecurityGroupDownloadRedisContainer
    SubnetIds:
      - !Ref privateSubnet1
      - !Ref privateSubnet2
    VpcEndpointType: Interface
    VpcId: !Ref VPC
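A possible correction to the third resource: the ECR documentation appears to describe S3 access through a gateway endpoint on the plain S3 service name (com.amazonaws.<region>.s3), attached to route tables rather than to subnets and security groups. A minimal sketch under that assumption, where privateS3GatewayEndpoint and privateRouteTable are hypothetical names (the latter standing for the route table associated with the private subnets):

```yaml
# Fragment of the Resources section (sketch under the assumption above)
privateS3GatewayEndpoint:        # hypothetical name
  Type: AWS::EC2::VPCEndpoint
  Properties:
    ServiceName: !Sub com.amazonaws.${AWS::Region}.s3
    VpcEndpointType: Gateway     # gateway endpoints take route tables, not subnets
    VpcId: !Ref VPC
    RouteTableIds:
      - !Ref privateRouteTable   # hypothetical route table for the private subnets
```

Since ServiceName takes a single string, the two ECR interface endpoints would presumably still need to be separate AWS::EC2::VPCEndpoint resources.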
Question 3
The documentation says: "For Security group, select the security groups to associate with the endpoint network interfaces."
Do I use the ECSSecurityGroupDownloadRedisContainer security group, which is attached to my ECS Service via NetworkConfiguration / AwsvpcConfiguration / SecurityGroups? If yes, do I need to associate both ECSSecurityGroupDownloadRedisContainer (which allows traffic on 443) and ECSSecurityGroupRedis (which allows traffic on 6379)? My guess is that the answer to the first question is yes and that only ECSSecurityGroupDownloadRedisContainer is needed, but I don't really know.
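One arrangement I have seen described for this is a dedicated security group for the endpoint network interfaces that admits HTTPS only from the task's security group, instead of reusing the task's own group. A sketch of that shape, where VPCEndpointSecurityGroup is a hypothetical name that is not in my template:

```yaml
# Fragment of the Resources section (sketch, not a confirmed answer)
VPCEndpointSecurityGroup:        # hypothetical name
  Type: AWS::EC2::SecurityGroup
  Properties:
    GroupDescription: Allow HTTPS from ECS tasks to the interface endpoints
    VpcId: !Ref VPC
    SecurityGroupIngress:
      - IpProtocol: tcp
        FromPort: 443
        ToPort: 443
        # Only traffic originating from the task's security group is admitted
        SourceSecurityGroupId: !GetAtt ECSSecurityGroupDownloadRedisContainer.GroupId
```

Each interface endpoint would then list !Ref VPCEndpointSecurityGroup under SecurityGroupIds. If that is right, ECSSecurityGroupRedis would not need to be associated with the endpoints at all, since they only serve port 443.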
Question 4
Can I somehow disable access to ECS on port 443 after the container has been downloaded? I only need access to 6379 for Redis; anything else seems like a security liability to me.
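For context on this question: my understanding is that security groups are stateful, so the image pull is an outbound connection from the task and should not need any inbound rule on 443. If so, the task's group could admit only Redis traffic from the start. A sketch of that shape, where ClientSecurityGroup is a hypothetical group standing for whatever connects to Redis:

```yaml
# Fragment of the Resources section (sketch under the assumption above)
ECSSecurityGroupRedis:
  Type: AWS::EC2::SecurityGroup
  Properties:
    GroupDescription: Allow Redis traffic to the task
    VpcId: !Ref VPC
    SecurityGroupIngress:
      - IpProtocol: tcp
        FromPort: 6379
        ToPort: 6379
        SourceSecurityGroupId: !GetAtt ClientSecurityGroup.GroupId   # hypothetical
    # No SecurityGroupEgress is listed, so the default allow-all egress rule
    # stays in place; that covers the outbound HTTPS connection for the pull.
```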
Background: Why I'm Doing This
I am trying to create an ECS cluster + Service + Task, but I am getting the error:
(CannotPullContainerError: inspect image has been retried 5 time(s): failed to resolve ref "docker.io/library/redis:latest": failed to do request: Head https://registry-1.docker.io/v2/library/redis/manifests/latest: dial tcp 34.231.251.252:443: i/o timeout)
Research has pointed me to this post: "Aws ecs fargate ResourceInitializationError: unable to pull secrets or registry auth"
It has an authoritative answer from an AWS employee, Nathan Peck, from March of this year: https://stackoverflow.com/a/66802973
They suggest one of three resolutions:
- Launch tasks into a public subnet, with a public IP address, so that they can communicate to ECR and other backing services using an internet gateway
- Launch tasks in a private subnet that has a VPC routing table configured to route outbound traffic via a NAT gateway in a public subnet. This way the NAT gateway can open a connection to ECR on behalf of the task.
- Launch tasks in a private subnet and make sure you have AWS PrivateLink endpoints configured in your VPC, for the services you need (ECR for image pull authentication, S3 for image layers, and AWS Secrets Manager for secrets).
As you know, Redis operates on port 6379, not port 443. My thoughts on these solutions:
- Option 1 is very dangerous! I should NEVER be forced to expose my database instance to the public internet. So that's out.
- Option 2 is what I started to implement, and then I realized that it involves allowing traffic on port 443 in my subnet, routing table, and so on. That seems like an unnecessary security risk when I'm only going to be using port 443 at container startup.
- Option 3 seems like the right thing to do.
Thus, my journey.