
I'm a novice AWS user trying to solve a use case where I need to import data into RDS from a CSV that is dropped into an S3 bucket.

A CSV file will be uploaded to an S3 bucket. From there, I want to run a custom Python script against that data; the script will build a set of metrics/scores from it. Next, I'd like to transform the script's output (to build multiple tables) and load it into tables that fit my RDS database schema.
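A minimal sketch of the parse-and-score portion of that pipeline, using only the Python standard library. The column names (`customer_id`, `name`, `spend`, `visits`) and the scoring rule are hypothetical; in a Lambda or Glue job the CSV text would come from `boto3` (e.g. `s3.get_object(...)["Body"].read().decode()`) rather than a literal string:

```python
import csv
import io

def transform(csv_text):
    """Parse raw CSV text and build rows for two hypothetical RDS tables:
    a `customers` table and a per-customer `scores` table."""
    reader = csv.DictReader(io.StringIO(csv_text))
    customers, scores = [], []
    for row in reader:
        customers.append((row["customer_id"], row["name"]))
        # Hypothetical scoring metric: spend per visit.
        score = float(row["spend"]) / int(row["visits"])
        scores.append((row["customer_id"], round(score, 2)))
    return customers, scores

# Example input as it might arrive from S3.
sample = "customer_id,name,spend,visits\n1,Ann,120.0,4\n2,Bob,90.0,3\n"
customers, scores = transform(sample)
# customers -> [("1", "Ann"), ("2", "Bob")]
# scores    -> [("1", 30.0), ("2", 30.0)]
```

Splitting one input row into per-table tuples like this keeps the scoring logic pure and easy to test, independent of where the file comes from or where the rows end up.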

I took a look at AWS Data Pipeline and AWS Glue, but I'm not quite sure which services to use. Any ideas would be greatly appreciated.
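For the final load step, here is a runnable sketch of inserting the transformed rows into multiple related tables. It uses the stdlib `sqlite3` module so the example is self-contained; against Aurora MySQL the same SQL would run through a driver such as `pymysql`, with `%s` placeholders instead of `?`. The schema and rows are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the RDS connection
cur = conn.cursor()
cur.execute("CREATE TABLE customers (customer_id TEXT PRIMARY KEY, name TEXT)")
cur.execute("CREATE TABLE scores (customer_id TEXT, score REAL)")

# Rows as produced by the parsing/scoring step.
customers = [("1", "Ann"), ("2", "Bob")]
scores = [("1", 30.0), ("2", 30.0)]

# executemany batches the inserts, one table at a time.
cur.executemany("INSERT INTO customers VALUES (?, ?)", customers)
cur.executemany("INSERT INTO scores VALUES (?, ?)", scores)
conn.commit()

rows = cur.execute(
    "SELECT c.name, s.score FROM customers c "
    "JOIN scores s USING (customer_id) ORDER BY c.name"
).fetchall()
# rows -> [("Ann", 30.0), ("Bob", 30.0)]
```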

Jackson
  • Are you specifically looking for an AWS service as a solution, or could you also do this by writing some functions in Python? – Mayank Porwal Nov 18 '18 at 03:53
  • That would work fine as well. I'm just trying to understand how I could use AWS services to leverage, let's say, a Python script. – Jackson Nov 18 '18 at 04:42
  • We have implemented a similar job using AWS Glue: the daily delta files are uploaded to S3 and processed by an AWS Glue job (Python). We do not have any major transformations, but use AWS Glue just to perform UPSERTs into our Aurora RDS table. For job scheduling, I used AWS Lambda, since I need to run the Glue job as soon as a file is uploaded to S3 instead of at scheduled intervals. – Yuva Nov 18 '18 at 06:46
  • In my case I want to parse the CSV from S3 and put it into a normalized Aurora RDS database with multiple tables. – Rajas Gujarathi Feb 18 '19 at 07:01
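The trigger pattern described in the comments — a Lambda that starts a Glue job as soon as a file lands in S3 — can be sketched as below. The job name `csv-to-rds` and the argument names are placeholders; `boto3` is imported inside the handler since it is provided by the Lambda runtime:

```python
import json

def bucket_and_key(event):
    """Extract the bucket name and object key from an S3 put-event payload."""
    record = event["Records"][0]["s3"]
    return record["bucket"]["name"], record["object"]["key"]

def lambda_handler(event, context):
    """Triggered by an S3 ObjectCreated event; starts a Glue job on the new file."""
    import boto3  # provided by the Lambda runtime
    bucket, key = bucket_and_key(event)
    glue = boto3.client("glue")
    run = glue.start_job_run(
        JobName="csv-to-rds",  # placeholder Glue job name
        Arguments={"--source_bucket": bucket, "--source_key": key},
    )
    return {"statusCode": 200, "body": json.dumps(run["JobRunId"])}

# A trimmed example of the event shape Lambda receives from S3:
event = {"Records": [{"s3": {"bucket": {"name": "my-bucket"},
                             "object": {"key": "incoming/data.csv"}}}]}
# bucket_and_key(event) -> ("my-bucket", "incoming/data.csv")
```

To wire this up, the S3 bucket needs an event notification for `s3:ObjectCreated:*` pointing at the Lambda function, and the Lambda's role needs permission to call `glue:StartJobRun`.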

0 Answers