I'm trying to create a DataSync task to copy files from EFS to S3, and I'm using Terraform to do it. From reading the documentation, it looks like I don't need a DataSync agent for this. Following the guide at https://ystoneman.medium.com/serverless-datasync-from-efs-to-s3-6cb3a7ab85f7, I have created the following:
- Security Group. I created this security group and assigned it to the EC2 config of the DataSync source location:
resource "aws_security_group" "sg-datasync" {
  name   = "datasync"
  vpc_id = "vpc-sampleVPC"
}
- DataSync Source Location (EFS)
resource "aws_datasync_location_efs" "source_efs" {
  efs_file_system_arn = "arn:aws:elasticfilesystem:ap-southeast-2:XXXXX:file-system/fs-6b3f3753"

  ec2_config {
    security_group_arns = [aws_security_group.sg-datasync.arn]
    subnet_arn          = "arn:aws:ec2:ap-southeast-2:XXXXX:subnet/subnet-09d919d3b76e9c7f0"
  }
}
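As a side note, the hard-coded ARNs above could instead be derived from data sources, which would rule out a typo in the account ID or region as a cause. This is only a sketch: the data source names `efs` and `datasync` are made up for illustration, and it assumes the file system and subnet IDs shown above are correct.

```hcl
# Look up the existing file system and subnet by ID (IDs taken from the config above).
data "aws_efs_file_system" "efs" {
  file_system_id = "fs-6b3f3753"
}

data "aws_subnet" "datasync" {
  id = "subnet-09d919d3b76e9c7f0"
}

# Variant of the source location that uses the looked-up ARNs
# instead of hand-written ARN strings.
resource "aws_datasync_location_efs" "source_efs" {
  efs_file_system_arn = data.aws_efs_file_system.efs.arn

  ec2_config {
    security_group_arns = [aws_security_group.sg-datasync.arn]
    subnet_arn          = data.aws_subnet.datasync.arn
  }
}
```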
- DataSync Target Location (S3)
resource "aws_datasync_location_s3" "target_s3" {
  s3_bucket_arn = local.s3_arn
  subdirectory  = "/some_target_folder"

  s3_config {
    bucket_access_role_arn = local.s3_bucket_role_arn
  }
}
- DataSync Task
resource "aws_datasync_task" "sampleTask" {
  destination_location_arn = aws_datasync_location_s3.target_s3.arn
  name                     = "sampleTask"
  source_location_arn      = aws_datasync_location_efs.source_efs.arn

  options {
    bytes_per_second = -1
  }
}
In addition to this, I have created a few more security-related resources:
- Security group rule to allow inbound NFS access from the DataSync source location's security group. This is based on the article, which says: "On your EFS file system mount target’s security group, allow inbound access on port 2049 from the DataSync source location’s security group."
resource "aws_security_group_rule" "datasync_to_efs" {
  type                     = "ingress"
  from_port                = 2049
  to_port                  = 2049
  protocol                 = "tcp"
  source_security_group_id = aws_security_group.sg-datasync.id
  security_group_id        = "sg-049fd2c6708c42c20"
}
- Security group rule to allow all outbound access on all ports to the security group of the EFS file system's mount target. Again, this is based on the article: "On your DataSync source location’s security group, allow all outbound access on all ports to your EFS file system’s mount target’s security group."
resource "aws_security_group_rule" "egress_datasync_to_efs" {
  type                     = "egress"
  from_port                = 0
  to_port                  = 65535
  protocol                 = "tcp"
  source_security_group_id = "sg-049fd2c6708c42c20"
  security_group_id        = aws_security_group.sg-datasync.id
}
Also note that 'sg-049fd2c6708c42c20' is the security group of the EFS file system's mount target. At least, that is what I believe it is, based on the EFS network configuration for fs-6b3f3753.
With all of this in place, the DataSync task and locations are created successfully. However, when I try to run the task, I get a connection timeout:
"Task failed to access location loc-0bdebcc42541f73e4: x40016: mount.nfs: Connection timed out"
FYI: loc-0bdebcc42541f73e4 is the source location, and the console shows the following details for it:
- Location ID: loc-0bdebcc42541f73e4
- Type: Amazon EFS file system
- Path: /
- File share: fs-6b3f3753
- Subnet: subnet-09d919d3b76e9c7f0
- Security groups: sg-0bb0d7ddb3dec8ca6
sg-0bb0d7ddb3dec8ca6 is the security group 'sg-datasync'. According to the console, it has no inbound rules and one outbound rule:
- IP version: -
- Type: All TCP
- Protocol: TCP
- Port range: 0-65535
- Destination: sg-049fd2c6708c42c20
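To double-check the assumption that sg-049fd2c6708c42c20 really belongs to the mount target, and to see which subnet and Availability Zone the mount target lives in, something like the following could be added. It is a sketch only: the data source and output names are illustrative, and looking up by `file_system_id` assumes the file system has exactly one mount target (with several, `mount_target_id` would be needed instead).

```hcl
# Look up the mount target of the file system.
# Assumption: fs-6b3f3753 has exactly one mount target.
data "aws_efs_mount_target" "efs" {
  file_system_id = "fs-6b3f3753"
}

# Security groups actually attached to the mount target;
# expected to include sg-049fd2c6708c42c20 if my assumption holds.
output "efs_mount_target_security_groups" {
  value = data.aws_efs_mount_target.efs.security_groups
}

# Subnet and Availability Zone of the mount target, for comparison
# with the subnet used in the DataSync location's ec2_config.
output "efs_mount_target_subnet_id" {
  value = data.aws_efs_mount_target.efs.subnet_id
}

output "efs_mount_target_az" {
  value = data.aws_efs_mount_target.efs.availability_zone_name
}
```

After `terraform apply`, `terraform output` would print these values so they can be compared against the DataSync location's subnet and the security group rules above.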
Looking at https://docs.aws.amazon.com/efs/latest/ug/troubleshooting-efs-mounting.html#mount-hangs-fails-timeout, it seems that I didn't configure either the EC2 instance or the mount target security group correctly. My questions are:
1. Where is the EC2 instance configuration in my Terraform above? Is it aws_datasync_location_efs.source_efs.ec2_config? My guess is that AWS temporarily spins up an EC2 instance to access the EFS, and that this block configures it.
2. Assuming no. 1 is correct, that EC2 instance has been configured with a) the security group 'sg-datasync', and b) the 'datasync_to_efs' rule, which configures the mount target security group (sg-049fd2c6708c42c20) to allow inbound NFS access from 'sg-datasync'. So what am I missing?
Any help or pointers would be very much appreciated!