I have written a program that processes two data files: one contains customer data and the other contains transaction data. I applied a reduce-side join to these files, and the output looks like this:

    Customer name     Number of transactions    Total amount
    Amit Kumar        4                         120000000
    Kawaldeep Sing    5                         20000000
    Sanosh singh      6                         10000000

Now I want the output for each name to go into a separate file when the program runs. For example, if a row contains data about Amit Kumar, that row should go into a file named Amit, and similarly for the other records.

If that is possible, and the job runs every 5 minutes, how can we append each run's output to the same file?

Please help me on this.

Thanks & Regards, Amit

Amit Kumar

2 Answers

Let the names (Amit, Kawaldeep, etc.) be the output keys of the mappers. Each reduce call then processes all the mapper output for one key; let the reducer emit that same key. You can then subclass MultipleTextOutputFormat (part of the old org.apache.hadoop.mapred API) and override generateFileNameForKeyValue so that each key gets its own output files. The code below may be useful.

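    // Imports needed (old "mapred" API):
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat;

    /**
     * Create output files based on the output record's key name.
     */
    static class KeyBasedMultipleTextOutputFormat
                 extends MultipleTextOutputFormat<Text, Text> {
        @Override
        protected String generateFileNameForKeyValue(Text key, Text value, String name) {
            // "name" is the default leaf file name; prefixing the key puts
            // each key's records in its own sub-directory of the output path.
            return key.toString() + "/" + name;
        }
    }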

And in the job class:

    jobConf.setOutputFormat(KeyBasedMultipleTextOutputFormat.class);
madhu
  • When I use the code above in my program, the compiler says that the method setOutputFormatClass in the type Job is not applicable for the argument. When I checked the import for the class above, it comes from the old API (mapred), not the new one (mapreduce). – Amit Kumar Oct 23 '15 at 08:15

Look for:

org.apache.hadoop.mapreduce.lib.output.MultipleOutputs

https://hadoop.apache.org/docs/r2.6.1/api/org/apache/hadoop/mapreduce/lib/output/MultipleOutputs.html
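A minimal sketch of how MultipleOutputs could be used from a new-API reducer, since the class above is only named and not shown in use. The class name PerCustomerReducer, the assumption that every incoming value is a plain numeric amount, and the "/part" file prefix are all hypothetical; the real reduce-join logic from the question would replace the loop. It needs the Hadoop client libraries on the classpath.

    import java.io.IOException;

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

    /** Hypothetical reducer: one output location per customer key. */
    public class PerCustomerReducer extends Reducer<Text, Text, Text, Text> {

        private MultipleOutputs<Text, Text> outputs;

        @Override
        protected void setup(Context context) {
            outputs = new MultipleOutputs<>(context);
        }

        @Override
        protected void reduce(Text customer, Iterable<Text> amounts, Context context)
                throws IOException, InterruptedException {
            long count = 0;
            long total = 0;
            // Assumption: each value is one transaction amount as text.
            for (Text amount : amounts) {
                count++;
                total += Long.parseLong(amount.toString().trim());
            }
            // The third argument is a base path relative to the job output
            // directory; characters unsafe in file names are replaced.
            String base = customer.toString().replaceAll("[^A-Za-z0-9_-]", "_") + "/part";
            outputs.write(customer, new Text(count + "\t" + total), base);
        }

        @Override
        protected void cleanup(Context context) throws IOException, InterruptedException {
            // Without close() the per-key files may be left incomplete.
            outputs.close();
        }
    }

In the driver, calling LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class) instead of job.setOutputFormatClass(...) should avoid the empty default part-r-* files, because nothing is written through the regular context in this sketch.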

For appending: merge the output files after the reduce phase.
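One way such a merge step might look, sketched against the local filesystem with only java.nio so it can be tried without a cluster. The class AppendRunOutput and the method mergeRun are hypothetical names, and the layout (one file per customer in each run's directory) is an assumption; on HDFS the same idea would go through the Hadoop FileSystem API instead.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper: folds one run's per-customer files into cumulative files. */
final class AppendRunOutput {

    /**
     * For every regular file in runDir, append its bytes to a file with the
     * same name in cumulativeDir, creating that file on first use.
     * Marker files such as _SUCCESS (names starting with '_' or '.') are skipped.
     * Returns the number of files merged.
     */
    static int mergeRun(Path runDir, Path cumulativeDir) throws IOException {
        Files.createDirectories(cumulativeDir);
        int merged = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(runDir)) {
            for (Path file : files) {
                String name = file.getFileName().toString();
                if (!Files.isRegularFile(file) || name.startsWith("_") || name.startsWith(".")) {
                    continue;
                }
                Path target = cumulativeDir.resolve(name);
                // CREATE + APPEND: the first run creates the file, later runs extend it.
                Files.write(target, Files.readAllBytes(file),
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
                merged++;
            }
        }
        return merged;
    }
}
```

Running mergeRun once after each 5-minute job, with that run's output directory as runDir, would leave one growing file per customer in cumulativeDir.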

Hope it's useful.

Anatoly Deyneka