
I am trying to trigger a Lambda function on object arrival in S3, along with object details like name and path, and then trigger a Python script on EMR which will access that file on S3. Please let me know how I can trigger the Python script (maybe from within a Pig/Hive script?) to process the file on EMR.
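
This is roughly what I have in mind for the Lambda side. It is only a minimal sketch: the cluster ID, the script path on the master node, and the step name are placeholders I made up, not real resources.

import boto3
from urllib.parse import unquote_plus

# Hypothetical values -- replace with the real running cluster ID and script location
CLUSTER_ID = 'j-XXXXXXXXXXXXX'                 # already-running EMR cluster
SCRIPT_PATH = '/home/hadoop/process_file.py'   # Python script present on the master node

emr = boto3.client('emr')

def lambda_handler(event, context):
    # The S3 event notification carries the bucket name and object key
    record = event['Records'][0]
    bucket = record['s3']['bucket']['name']
    key = unquote_plus(record['s3']['object']['key'])  # keys in events are URL-encoded

    # Submit a step to the running cluster; command-runner.jar executes the script on the
    # master node with the bucket and key as arguments, so no new cluster is launched per file
    emr.add_job_flow_steps(
        JobFlowId=CLUSTER_ID,
        Steps=[{
            'Name': f'process {key}',
            'ActionOnFailure': 'CONTINUE',
            'HadoopJarStep': {
                'Jar': 'command-runner.jar',
                'Args': ['python3', SCRIPT_PATH, bucket, key],
            },
        }],
    )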

To copy the file to the local filesystem on EMR once we have the object details from the Lambda trigger, for the Hive/Pig script:

import boto3

s3_client = boto3.client('s3')
# download_file(Bucket, Key, Filename): bucket name without the s3:// prefix, the object key, and a local file path (not a directory)
s3_client.download_file('<bucket name>', '<object key>', '/home/hadoop/data/<file name>')
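
For the EMR side, a minimal sketch of the script the step would run, assuming it receives the bucket and key as command-line arguments; the HQL path and the hivevar name are placeholders for whatever the Hive/Pig script actually expects.

import os
import subprocess
import sys

import boto3

def main():
    # Bucket and key are passed in by the EMR step (see the Lambda sketch above)
    bucket, key = sys.argv[1], sys.argv[2]

    # Download the new object to the local data directory on the node
    local_path = os.path.join('/home/hadoop/data', os.path.basename(key))
    s3_client = boto3.client('s3')
    s3_client.download_file(bucket, key, local_path)

    # Hand the local file to the Hive script as a variable (placeholder .hql path)
    subprocess.check_call(['hive', '--hivevar', f'input_file={local_path}',
                           '-f', '/home/hadoop/process.hql'])

if __name__ == '__main__':
    main()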

Please let me know how this can be done. Files will arrive every 30-40 minutes.

RajaR
  • Hey @RajaR, welcome to StackOverflow! To clarify the workflow you're asking about here -- are you trying to trigger a Lambda after an object is uploaded to S3, and have that Lambda kick off an EMR job that uses that object? – Nick Walsh May 07 '19 at 04:43
  • @nmwalsh - Here are the steps I want to achieve: once an object arrives in the S3 bucket --> it will trigger a Lambda with the object details like the name, & that should trigger a Python script on EMR (the Python script will process this new S3 object) & place the output back onto S3 in a different bucket. Note: I cannot launch a cluster for each file; I want to run the process on a running cluster. – RajaR May 07 '19 at 13:02
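
To cover the last part of that clarification, a minimal sketch of placing the output back onto S3 in a different bucket; the local result path, output bucket name, and key are placeholders.

import boto3

s3_client = boto3.client('s3')
# upload_file(Filename, Bucket, Key) -- hypothetical result file and output bucket
s3_client.upload_file('/home/hadoop/data/<result file>', '<output bucket name>', 'results/<result file>')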

0 Answers