I have a Hadoop job that writes many part files to a folder on HDFS.
For example:
/output/s3/2014-09-10/part...
What is the best way, using the S3 Java API, to upload those parts as a single file in S3?
For example
s3://jobBucket/output-file-2014-09-10.csv
One possible solution is to merge the parts into a single HDFS file first and then upload that, but this doubles the I/O. Using a single reducer is not an option either.
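For reference, the direction I was considering is to stream each HDFS part directly into one S3 multipart upload, so nothing is re-written to HDFS. This is only a sketch (bucket, key, and paths are illustrative), and it assumes every part except the last is at least 5 MB, which S3 requires for multipart uploads:

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.CompleteMultipartUploadRequest;
import com.amazonaws.services.s3.model.InitiateMultipartUploadRequest;
import com.amazonaws.services.s3.model.InitiateMultipartUploadResult;
import com.amazonaws.services.s3.model.PartETag;
import com.amazonaws.services.s3.model.UploadPartRequest;

public class HdfsPartsToS3 {
    public static void main(String[] args) throws Exception {
        String bucket = "jobBucket";                          // illustrative
        String key = "output-file-2014-09-10.csv";            // illustrative

        FileSystem fs = FileSystem.get(new Configuration());
        // Glob the reducer outputs; each part becomes one multipart-upload part.
        FileStatus[] parts = fs.globStatus(new Path("/output/s3/2014-09-10/part-*"));

        AmazonS3 s3 = new AmazonS3Client();                   // credentials from the default chain
        InitiateMultipartUploadResult init =
                s3.initiateMultipartUpload(new InitiateMultipartUploadRequest(bucket, key));

        List<PartETag> etags = new ArrayList<PartETag>();
        int partNumber = 1;
        for (FileStatus part : parts) {
            UploadPartRequest req = new UploadPartRequest()
                    .withBucketName(bucket)
                    .withKey(key)
                    .withUploadId(init.getUploadId())
                    .withPartNumber(partNumber++)
                    // Stream straight from HDFS; no local or HDFS merge step.
                    .withInputStream(fs.open(part.getPath()))
                    .withPartSize(part.getLen());
            etags.add(s3.uploadPart(req).getPartETag());
        }

        s3.completeMultipartUpload(
                new CompleteMultipartUploadRequest(bucket, key, init.getUploadId(), etags));
    }
}
```

I'm not sure this is the cleanest approach, though, especially if some parts fall under the 5 MB minimum.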
Thanks,