If you are using the AWS Aurora flavour of RDS, you're in luck: https://aws.amazon.com/about-aws/whats-new/2017/06/amazon-aurora-can-export-data-into-amazon-s3/
If you are using normal MySQL on RDS, I have a workaround that got some of my data out to CSV: I used AWS Data Pipeline with a CopyActivity, which copies the result of a SQL query you specify to an S3 location you specify. Here is an extract from my pipeline definition (it's JSON) showing the input, output, and CopyActivity objects:
{
    "myDescription": "Mysql datanode that represents the input database and query that gets the marketing table data.",
    "type": "MySqlDataNode",
    "name": "SourceRDSTable",
    "id": "SourceRDSTable",
    "connectionString": "jdbc:mysql://marketing.example.com:3306/schemaname",
    "table": "marketing",
    "selectQuery": "SELECT * FROM schemaname.marketing WHERE active=1 AND isOptedOut=0",
    "username": "mysqluser",
    "*password": "redacted"
},
{
    "myDescription": "S3 datanode that represents the S3 directory where the table data will be stored.",
    "type": "S3DataNode",
    "name": "S3OutputLocation",
    "id": "S3OutputLocation",
    "filePath": "s3://mys3bucket/output/marketing_dump_#{format(@scheduledStartTime, 'YYYY-MM-dd')}.csv"
},
{
    "myDescription": "CopyActivity used to dump a subset of the marketing table to S3 as csv",
    "type": "CopyActivity",
    "name": "Copy marketing table to S3",
    "id": "RDStoS3CopyActivity",
    "input": { "ref": "SourceRDSTable" },
    "output": { "ref": "S3OutputLocation" },
    "runsOn": { "ref": "ec2_worker_instance" }
}
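
For completeness: the runsOn field above points at an Ec2Resource object that isn't in the extract. A minimal sketch of one might look like the following; the instance type, terminateAfter value, and role names (the Data Pipeline defaults) are illustrative, not copied from my pipeline:

{
    "myDescription": "Example Ec2Resource for the runsOn reference; instance type, timeout, and roles are illustrative.",
    "type": "Ec2Resource",
    "name": "ec2_worker_instance",
    "id": "ec2_worker_instance",
    "instanceType": "t1.micro",
    "terminateAfter": "2 Hours",
    "role": "DataPipelineDefaultRole",
    "resourceRole": "DataPipelineDefaultResourceRole"
}

Data Pipeline spins this instance up for each run and (thanks to terminateAfter) shuts it down afterwards, so you only pay for the copy itself.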
(Although I'm happy with the results now that it's done, I have to admit it took me a week to set up this pipeline! It's been running daily for 4 months with no hitches, but if I were to do it again, I would instead upgrade to Aurora and use its INTO OUTFILE S3 syntax.)
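
For reference, the Aurora route replaces all of the above with a single statement. This is a sketch based on the documented SELECT ... INTO OUTFILE S3 syntax, reusing the table and bucket names from my example; it assumes an Aurora MySQL cluster whose associated IAM role can write to the bucket, and you'd adjust the region prefix in the S3 URI to match your bucket:

-- Aurora MySQL only. The cluster needs an IAM role with s3:PutObject
-- on the target bucket (see the aurora_select_into_s3_role /
-- aws_default_s3_role cluster parameters). Region prefix is an assumption.
SELECT *
FROM schemaname.marketing
WHERE active = 1 AND isOptedOut = 0
INTO OUTFILE S3 's3-us-east-1://mys3bucket/output/marketing_dump'
    FIELDS TERMINATED BY ','
    LINES TERMINATED BY '\n';

Note that Aurora writes the result as one or more files under that prefix (with part-number suffixes) rather than a single .csv, so downstream consumers may need a small adjustment.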