I am trying to set up what I think is a fairly simple pipeline: I upload a CSV file to an S3 bucket, the upload triggers a Lambda function, and the function reads the file with Pandas and writes some of the data to a MySQL table running in RDS.
I have been following these two tutorials to learn how to do that:
- https://docs.aws.amazon.com/lambda/latest/dg/with-s3-example.html
- https://docs.aws.amazon.com/lambda/latest/dg/services-rds-tutorial.html#vpc-rds-prereqs
I have created the Lambda function to run inside the same VPC (the default VPC) that the RDS instance is running in. The problem is that when I test the function with the same test event used in the first tutorial, I get a "Task timed out after 3.00 seconds" error. I found this SO post and this one, which suggest creating a VPC endpoint, and I followed the link provided by Mark B to create the endpoint. However, testing the function still gives a task timeout error. I've even followed this tutorial and created a bucket policy to allow access from the VPC endpoint, but to no avail.
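One way to narrow this down is to check whether the function can open a TCP connection to S3 at all, separately from any IAM or bucket-policy question. The following is a minimal sketch using only the standard library. The helper name `can_reach` and the `us-east-1` endpoint host are assumptions for illustration and should be replaced with the bucket's actual region.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and connect timeouts.
        return False


def lambda_handler(event, context):
    # Assumed regional endpoint; replace us-east-1 with the bucket's region.
    host = "s3.us-east-1.amazonaws.com"
    reachable = can_reach(host)
    print(f"TCP connect to {host}:443 -> {reachable}")
    return {"host": host, "reachable": reachable}
```

The connect timeout is kept below the function's 3-second limit so that a blocked route shows up as `reachable: False` in the logs rather than as another "Task timed out" error. If this returns `False`, the problem is most likely in the network path (route tables, endpoint association, security groups) rather than in the permissions.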
For the Lambda function, I have created an execution role with the following policies:
- AWSLambdaVPCAccessExecutionRole
- AmazonS3FullAccess
- the policy from the first tutorial
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "logs:PutLogEvents",
                    "logs:CreateLogGroup",
                    "logs:CreateLogStream"
                ],
                "Resource": "arn:aws:logs:*:*:*"
            },
            {
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject"
                ],
                "Resource": "arn:aws:s3:::*/*"
            }
        ]
    }
The code I use for the function is:
    import json
    import urllib.parse
    import boto3

    print('Loading function')

    s3 = boto3.client('s3')


    def lambda_handler(event, context):
        print("Received event: " + json.dumps(event, indent=2))

        # Get the object from the event and show its content type
        bucket = event['Records'][0]['s3']['bucket']['name']
        key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
        try:
            response = s3.get_object(Bucket=bucket, Key=key)
            print("CONTENT TYPE: " + response['ContentType'])
            print("Response: ", response)
            return response['ContentType']
        except Exception as e:
            print(e)
            print('Error getting object {} from bucket {}. Make sure they exist and your bucket is in the same region as this function.'.format(key, bucket))
            raise e
The test I use for the function is the same as this one: https://docs.aws.amazon.com/lambda/latest/dg/with-s3-example.html#with-s3-example-test-dummy-event
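For reference, the part of that test event the handler actually reads is small. The sketch below shows a trimmed-down event and the same bucket/key extraction the handler performs. The bucket name, the object key, and the helper name `extract_s3_object` are hypothetical; in a real test the bucket and key must point at an object that exists.

```python
import urllib.parse


def extract_s3_object(event: dict) -> tuple:
    """Return (bucket, decoded key) from the first record of an S3 event."""
    s3_info = event["Records"][0]["s3"]
    bucket = s3_info["bucket"]["name"]
    # Keys arrive URL-encoded in S3 events, so "+" and "%xx" must be decoded.
    key = urllib.parse.unquote_plus(s3_info["object"]["key"], encoding="utf-8")
    return bucket, key


# Trimmed-down test event; bucket name and key are placeholders.
sample_event = {
    "Records": [
        {
            "eventSource": "aws:s3",
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "uploads/my+data%281%29.csv"},
            },
        }
    ]
}

bucket, key = extract_s3_object(sample_event)
print(bucket, key)  # example-bucket uploads/my data(1).csv
```

A wrong bucket or key in the event would normally surface as an S3 error from `get_object` rather than as a timeout, which helps separate a test-event problem from a connectivity problem.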
In the Lambda function, the resource-based policy statements look like this
I would really appreciate any suggestions on how to move on from here.