Dear all,
I have a MapReduce application whose Mapper accepts the following data types for its input key and input value:
public class The_Mapper extends MapReduceBase implements
Mapper<VIntWritable, Text, VIntWritable, VIntWritable>
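(Sketched as a skeleton with the application logic omitted, since only the declared key/value types matter here, the class looks roughly like this:)

import java.io.IOException;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.VIntWritable;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class The_Mapper extends MapReduceBase
        implements Mapper<VIntWritable, Text, VIntWritable, VIntWritable> {

    @Override
    public void map(VIntWritable key, Text value,
                    OutputCollector<VIntWritable, VIntWritable> output, Reporter reporter)
            throws IOException {
        // ... application logic that expects a VIntWritable input key ...
    }
}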
The input file format in the Driver is SequenceFileInputFormat.
My input is a .txt file. To be accepted as input by the application it has to be a SequenceFile, so I converted it with an identity Mapper and Reducer. However, since the input key for text files read with TextInputFormat can't be VIntWritable (it is the LongWritable byte offset of each line), the output key and output value of the identity Reducer have these data types:
public class My_TextToSequenceReducer extends MapReduceBase implements Reducer<LongWritable, Text, LongWritable, Text>
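In full, the reducer is essentially a pass-through; a minimal sketch of it (the real class may differ slightly, but it simply forwards every (offset, line) pair into the SequenceFile):

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class My_TextToSequenceReducer extends MapReduceBase
        implements Reducer<LongWritable, Text, LongWritable, Text> {

    @Override
    public void reduce(LongWritable key, Iterator<Text> values,
                       OutputCollector<LongWritable, Text> output, Reporter reporter)
            throws IOException {
        // Identity behaviour: emit every value under the original LongWritable key.
        while (values.hasNext()) {
            output.collect(key, values.next());
        }
    }
}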
However, this cannot be accepted as input by the application's Mapper described in the first paragraph above, and the job throws:
java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.VIntWritable
In the TextToSequence job setup, the input and output formats are declared as follows:
conf.setInputFormat(TextInputFormat.class);
conf.setOutputFormat(SequenceFileOutputFormat.class);
SequenceFileOutputFormat.setOutputCompressionType(conf, CompressionType.BLOCK);
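The whole job setup is roughly the following (a minimal sketch using the old org.apache.hadoop.mapred API; the driver class name and the paths are placeholders, not the exact code):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile.CompressionType;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.SequenceFileOutputFormat;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.lib.IdentityMapper;

public class My_TextToSequenceDriver {  // hypothetical driver class name
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(My_TextToSequenceDriver.class);
        conf.setJobName("text-to-sequence");

        // Read plain text as (LongWritable offset, Text line) and write a block-compressed SequenceFile.
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(SequenceFileOutputFormat.class);
        SequenceFileOutputFormat.setOutputCompressionType(conf, CompressionType.BLOCK);

        conf.setOutputKeyClass(LongWritable.class);
        conf.setOutputValueClass(Text.class);

        conf.setMapperClass(IdentityMapper.class);
        conf.setReducerClass(My_TextToSequenceReducer.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}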
Then the output of this job is given as input to another job with this setup:
conf.setInputFormat(SequenceFileInputFormat.class);
conf.setOutputFormat(TextOutputFormat.class);
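Sketched in the same style, my understanding of the relevant part of the second job's wiring is the following (the driver class name is a placeholder; the real driver belongs to the forked project):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.VIntWritable;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.SequenceFileInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

public class The_App_Driver {  // placeholder name for the forked application's driver
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(The_App_Driver.class);

        // The SequenceFile written by the previous job stores (LongWritable, Text) records.
        conf.setInputFormat(SequenceFileInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        // The_Mapper, however, declares VIntWritable as its input key type,
        // so the LongWritable key read back from the SequenceFile cannot be cast to it.
        conf.setMapperClass(The_Mapper.class);
        conf.setOutputKeyClass(VIntWritable.class);    // assumed from the Mapper's output types
        conf.setOutputValueClass(VIntWritable.class);  // assumed from the Mapper's output types

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}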
The application was written for Apache Hadoop 1.2.0; it is open source and I forked it to re-use it. (The Hadoop version I run it on is newer, 2.7.5, but I assume it is backward compatible, so this shouldn't be the problem.)
How can I solve this? Thank you in advance!