I'm trying to copy a CSV file from HDFS to S3 with S3DistCp, but the job fails with these errors:

Error: java.lang.RuntimeException: Reducer task failed to copy 1 files: hdfs://<ip>.ec2.internal:8020/output/data.csv/part-00000-5a0c6bcc-48eb-4390-9d14-13a2f7a4408b.csv etc
Error: java.lang.RuntimeException: Reducer task failed to copy 1 files: hdfs://<ip>.ec2.internal:8020/output/data.csv/_SUCCESS etc
Error: java.lang.RuntimeException: Reducer task failed to copy 1 files: hdfs://<ip>.ec2.internal:8020/output/data.csv/part-00000-5a0c6bcc-48eb-4390-9d14-13a2f7a4408b.csv etc
Error: java.lang.RuntimeException: Reducer task failed to copy 1 files: hdfs://<ip>.ec2.internal:8020/output/data.csv/_SUCCESS etc
Error: java.lang.RuntimeException: Reducer task failed to copy 1 files: hdfs://<ip>.ec2.internal:8020/output/data.csv/part-00000-5a0c6bcc-48eb-4390-9d14-13a2f7a4408b.csv etc
Error: java.lang.RuntimeException: Reducer task failed to copy 1 files: hdfs://<ip>.ec2.internal:8020/output/data.csv/_SUCCESS etc
Exception in thread "main" java.lang.RuntimeException: Error running job
Caused by: java.io.IOException: Job failed!

I've tried increasing the memory and setting the number of workers to 1. My arguments are as follows:

-D s3DistCp.copyfiles.mapper.numWorkers=1 -D mapred.child.java.opts=-Xmx1024m --src=hdfs:///output/data.csv/ --dest=s3://<bucket>/<directory>/data.csv/

I also made sure the EMR role has full S3 access. Any advice on how to fix this error?
