We have a requirement to fetch .csv files from an S3 bucket at a client location (the client will provide the bucket details and any other information required). Every day we need to pull this data into our own S3 bucket so that we can process it further. Please suggest the best approach or technology to achieve this.
I am planning to do it with Python (boto, possibly together with Pandas or PySpark) or with Spark, the reason being that once we have this data it may need to be processed further.
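To make the boto option concrete, here is a minimal sketch of what I have in mind, assuming boto3 (the current boto release) and credentials that can read the client's bucket and write to ours. The bucket names, the `exports/` prefix, and the helper names `csv_keys`, `copy_csv_files` and `main` are placeholders I made up for illustration, not anything the client has specified.

```python
def csv_keys(s3, bucket, prefix=""):
    """Yield the keys of all .csv objects under `prefix` in `bucket`."""
    # list_objects_v2 returns at most 1000 keys per call, so paginate.
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # An empty listing has no "Contents" entry at all.
        for obj in page.get("Contents", []):
            if obj["Key"].lower().endswith(".csv"):
                yield obj["Key"]


def copy_csv_files(s3, src_bucket, dst_bucket, prefix=""):
    """Copy every .csv object from src_bucket to dst_bucket, keeping its key."""
    copied = []
    for key in csv_keys(s3, src_bucket, prefix):
        # Managed copy: the data is copied between buckets by S3 itself
        # rather than being downloaded to this machine first.
        s3.copy({"Bucket": src_bucket, "Key": key}, dst_bucket, key)
        copied.append(key)
    return copied


def main():
    # Imported here so the helpers above can be exercised with a stub client.
    import boto3

    s3 = boto3.client("s3")
    # Hypothetical bucket names and prefix; replace with the real ones.
    copied = copy_csv_files(s3, "client-bucket", "our-bucket", prefix="exports/")
    print(f"copied {len(copied)} file(s)")
```

Calling `main()` once a day from a scheduler (cron or similar) would cover the daily pull. This version copies every matching file on each run; skipping files that were already copied would need extra bookkeeping that is not shown here.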