I have a container running in an EC2 instance on ECS. The container hosts a Django-based application that uses S3 for file storage and RDS for its database. I have configured my VPC, subnets, VPC endpoints, Internet Gateway, roles, security groups, and other parameters such that I am able to host the site, connect to the RDS instance, and access the site.

The issue is with the connection to S3. As part of the application setup I run the command python manage.py collectstatic --no-input, which should upload any new or modified files to S3. Instead, the program hangs and will not continue. No files are transferred to the already set up S3 bucket.

Details of the set up:

All of the below is hosted on AWS Gov Cloud

VPC and Subnets

  • 1 VPC located in Gov Cloud East with 2 availability zones (AZ) and one private and public subnet in each AZ (4 total subnets)
  • The 3 default routing tables (1 for each private subnet, and 1 for the two public subnets together)
  • DNS hostnames and DNS resolution are both enabled

VPC Endpoints

All endpoints have the "vpce-sg" security group attached and are associated to the above vpc

  • s3 gateway endpoint (set up to use the two private subnet routing tables)
  • ecr-api interface endpoint
  • ecr-dkr interface endpoint
  • ecs-agent interface endpoint
  • ecs interface endpoint
  • ecs-telemetry interface endpoint
  • logs interface endpoint
  • rds interface endpoint

Security Groups

  • Elastic Load Balancer Security Group (elb-sg)

    • Used for the elastic load balancer
    • Only allows inbound traffic from my local IP
    • No outbound restrictions
  • ECS Security Group (ecs-sg)

    • Used for the EC2 instance in ECS
    • Allows all traffic from the elb-sg
    • Allows http:80, https:443 from vpce-sg for s3
    • Allows postgresql:5432 from vpce-sg for rds
    • No outbound restrictions
  • VPC Endpoints Security Group (vpce-sg)

    • Used for all vpc endpoints
    • Allows http:80, https:443 from ecs-sg for s3
    • Allows postgresql:5432 from ecs-sg for rds
    • No outbound restrictions

Elastic Load Balancer

  • Set up to use an Amazon certificate for HTTPS, with a domain managed by GoDaddy, since Gov Cloud Route 53 does not allow public hosted zones
  • Listener on http permanently redirects to https

Roles

  • ecsInstanceRole (Used for the EC2 instance on ECS)

    • Attached policies: AmazonS3FullAccess, AmazonEC2ContainerServiceforEC2Role, AmazonRDSFullAccess
    • Trust relationships: ec2.amazonaws.com
  • ecsTaskExecutionRole (Used for executionRole in task definition)

    • Attached policies: AmazonECSTaskExecutionRolePolicy
    • Trust relationships: ec2.amazonaws.com, ecs-tasks.amazonaws.com
  • ecsRunTaskRole (Used for taskRole in task definition)

    • Attached policies: AmazonS3FullAccess, CloudWatchLogsFullAccess, AmazonRDSFullAccess
    • Trust relationships: ec2.amazonaws.com, ecs-tasks.amazonaws.com

S3 Bucket

  • Standard bucket set up in the same Gov Cloud region as everything else

Troubleshooting

If I bypass the connection to s3 the application successfully launches and I can connect to the website, but since static files are supposed to be hosted on s3 there is less formatting and images are missing.

Using a bastion instance I was able to ssh into the EC2 instance running the container and successfully test my connection to s3 from there using aws s3 ls s3://BUCKET_NAME

If I connect to a shell within the application container itself and I try to connect to the bucket using...

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket(BUCKET_NAME)
# head_bucket raises an exception if the bucket cannot be reached
s3.meta.client.head_bucket(Bucket=bucket.name)

I receive a timeout error...

File "/.venv/lib/python3.9/site-packages/urllib3/connection.py", line 179, in _new_conn
    raise ConnectTimeoutError(
urllib3.exceptions.ConnectTimeoutError: (<botocore.awsrequest.AWSHTTPSConnection object at 0x7f3da4467190>, 'Connection to BUCKET_NAME.s3.amazonaws.com timed out. (connect timeout=60)')
...
File "/.venv/lib/python3.9/site-packages/botocore/httpsession.py", line 418, in send
    raise ConnectTimeoutError(endpoint_url=request.url, error=e)
botocore.exceptions.ConnectTimeoutError: Connect timeout on endpoint URL: "https://BUCKET_NAME.s3.amazonaws.com/"

Based on this article I think this may have something to do with the fact that I am using the GoDaddy DNS servers which may be preventing proper URL resolution for S3.

If you're using the Amazon DNS servers, you must enable both DNS hostnames and DNS resolution for your VPC. If you're using your own DNS server, ensure that requests to Amazon S3 resolve correctly to the IP addresses maintained by AWS.

I am unsure of how to ensure that requests to Amazon S3 resolve correctly to the IP address maintained by AWS. Perhaps I need to set up another private DNS on route53?
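One way to separate a DNS problem from a routing problem is to resolve the hostname and then attempt a plain TCP connection from inside the container. Below is a minimal sketch using only the standard library. The helper name probe and the 5 second timeout are illustrative assumptions, and the hostname should be whichever one appears in your own traceback.

```python
import socket


def probe(host, port=443, timeout=5.0):
    """Return (resolved_ips, connected) for host:port.

    Hypothetical diagnostic helper, not part of boto3 or Django.
    """
    try:
        # Step 1: name resolution only.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return [], False

    ips = sorted({info[4][0] for info in infos})

    try:
        # Step 2: TCP connect, which is where a routing problem would show up.
        with socket.create_connection((host, port), timeout=timeout):
            return ips, True
    except OSError:
        return ips, False


if __name__ == "__main__":
    # Replace BUCKET_NAME with the real bucket name before running.
    print(probe("BUCKET_NAME.s3.amazonaws.com"))
```

If the list of addresses is populated but the connection flag is False, name resolution is working and the timeout is more likely caused by routing or by the endpoint being addressed than by the DNS provider.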

I have tried a very similar set up for this application in AWS non-Gov Cloud using route53 public DNS instead of GoDaddy and there is no issue connecting to S3.

Please let me know if there is any other information I can provide to help.

Jacob Rita

1 Answer

AWS Region

The issue lies in how boto3 handles different AWS regions, and it may be unique to AWS GovCloud. Originally I did not have a region configured for S3, but according to the docs an optional AWS_S3_REGION_NAME value can be set.

AWS_S3_REGION_NAME (optional: default is None) Name of the AWS S3 region to use (eg. eu-west-1)
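For illustration, a settings fragment along these lines would set the region. It assumes django-storages is the storage backend, which is where the quoted setting is documented. The environment variable names and the us-gov-east-1 fallback are assumptions for a GovCloud (US-East) deployment, not values taken from my actual configuration.

```python
import os

# Assumption: values are supplied through environment variables of the same name.
AWS_STORAGE_BUCKET_NAME = os.environ.get("AWS_STORAGE_BUCKET_NAME", "BUCKET_NAME")

# Assumption: "us-gov-east-1" is the bucket's region; change it to match yours.
AWS_S3_REGION_NAME = os.environ.get("AWS_S3_REGION_NAME", "us-gov-east-1")
```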

I reached this conclusion thanks to a Stack Overflow answer I was using to try to connect to S3 manually via boto3. I noticed that it passed a region_name argument when creating the session, which prompted me to check that I had set the region correctly in my app.settings and environment variables.
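A manual check with an explicit region could look roughly like the sketch below. The traceback in the question shows the client connecting to BUCKET_NAME.s3.amazonaws.com, with no region in the hostname, so the idea is to make the region explicit and see which endpoint the client picks. The helper names, the 5 second timeout, and the fallback to AWS_DEFAULT_REGION are assumptions for illustration only.

```python
import os


def s3_client_kwargs(region_name=None):
    """Build keyword arguments for boto3.client("s3", ...) with an explicit region."""
    region = (
        region_name
        or os.environ.get("AWS_S3_REGION_NAME")
        or os.environ.get("AWS_DEFAULT_REGION")
    )
    if not region:
        raise ValueError("No region configured; pass region_name or set AWS_S3_REGION_NAME")
    return {"region_name": region}


def head_bucket(bucket_name, region_name=None):
    """Call head_bucket and return the endpoint URL the client used."""
    # Third-party imports are kept local so the helper above works without boto3.
    import boto3
    from botocore.config import Config

    # Short timeout and a single attempt so a routing failure surfaces quickly.
    config = Config(connect_timeout=5, retries={"max_attempts": 1})
    client = boto3.client("s3", config=config, **s3_client_kwargs(region_name))
    client.head_bucket(Bucket=bucket_name)
    return client.meta.endpoint_url
```

From a shell in the container, head_bucket("BUCKET_NAME", "us-gov-east-1") should either return the endpoint URL or raise the same kind of ConnectTimeoutError, this time against the regional hostname.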

If anyone has some background on why this needs to be set for GovCloud functionality but apparently not for commercial, I would be interested to know.

Signature Version

I also had to specify AWS_S3_SIGNATURE_VERSION in app.settings so that boto3 knew to use version 4 of the signature. According to the docs:

As of boto3 version 1.13.21 the default signature version used for generating presigned urls is still v2. To be able to access your s3 objects in all regions through presigned urls, explicitly set this to s3v4. Set this to use an alternate version such as s3. Note that only certain regions support the legacy s3 (also known as v2) version.
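In a settings module this comes down to a single line. The sketch below again assumes django-storages is reading the setting.

```python
# "s3v4" selects Signature Version 4 (AWS4-HMAC-SHA256) for requests and presigned URLs.
AWS_S3_SIGNATURE_VERSION = "s3v4"
```

When calling boto3 directly rather than through Django settings, the corresponding option is botocore's Config(signature_version="s3v4") passed as the config argument when creating the client.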

Some additional information in this Stack Overflow response explains that S3 regions deployed after January 2014 only support signature version 4. AWS docs notice

Apparently GovCloud is in this group of newly deployed regions.

If you do not specify this, calls to the S3 bucket for static files (such as JS scripts) made while the web application is running will receive a 400 response. S3 responds with the error message:

<Error>
<Code>InvalidRequest</Code>
<Message>The authorization mechanism you have provided is not supported. Please use AWS4-HMAC-SHA256.</Message>
<RequestId>#########</RequestId>
<HostId>##########</HostId>
</Error>
Jacob Rita