I want/need to pass along the rowkey to the Reducer, as the rowkey is calculated in advance, and the information is not available anymore at that stage. (The Reducer executes a Put.)
First I tried to just use inner classes, e.g.
public class MRMine {
    private byte[] rowkey;
    public void start(Configuration c, Date d) {
        // calc rowkey based on date
        TableMapReduceUtil.initTableMapperJob(...);
        TableMapReduceUtil.initTableReducerJob(...);
    }
    public class MyMapper extends TableMapper<Text, IntWritable> {...}
    public class MyReducer extends TableReducer<Text, IntWritable, ImmutableBytesWritable> {...}
}
and both MyMapper and MyReducer have the default constructor defined. But this approach leads to the following exception(s):
java.lang.RuntimeException: java.lang.NoSuchMethodException: com.mycompany.MRMine$MyMapper.<init>()
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:115)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:719)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Caused by: java.lang.NoSuchMethodException: com.company.MRMine$MyMapper.<init>()
at java.lang.Class.getConstructor0(Class.java:2730)
at java.lang.Class.getDeclaredConstructor(Class.java:2004)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:109)
I got rid of the exception by declaring the inner classes static (as suggested in "Runtimeexception: java.lang.NoSuchMethodException: tfidf$Reduce.<init>()"), but then I'd have to make the rowkey static as well, and I'm running multiple jobs in parallel.
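As far as I understand it, the exception occurs because a non-static inner class has no true no-argument constructor: its constructor implicitly takes the enclosing instance as a parameter, so the reflective lookup that ReflectionUtils.newInstance performs (getDeclaredConstructor() in the trace above) finds nothing. Here is a minimal sketch of that behaviour in plain Java, without Hadoop. The class names InnerClassReflectionDemo, Inner, and Nested and the helper hasNoArgConstructor are made up for illustration only.

```java
public class InnerClassReflectionDemo {

    // Non-static inner class: the compiler adds the enclosing instance
    // as a hidden constructor parameter, so there is no no-arg constructor.
    public class Inner {
        public Inner() {}
    }

    // Static nested class: has a real no-argument constructor.
    public static class Nested {
        public Nested() {}
    }

    // Hypothetical helper: returns true if a no-argument constructor
    // can be found via reflection, false otherwise.
    public static boolean hasNoArgConstructor(Class<?> clazz) {
        try {
            clazz.getDeclaredConstructor();
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Inner:  " + hasNoArgConstructor(Inner.class));
        System.out.println("Nested: " + hasNoArgConstructor(Nested.class));
    }
}
```

This would explain why declaring the classes static makes the exception go away, and also why the instance field rowkey is then no longer reachable from the nested classes.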
I found https://stackoverflow.com/a/6739905/1338732 where the configure method of the Reducer is overridden, but that method doesn't seem to be available anymore. Anyhow, I wouldn't be able to pass along a value that way.
I was thinking of (mis)using the Configuration by just adding a new key-value pair to it. Would this work, and is it the correct approach?
Is there a way to pass along any custom value to the reducer?
The versions I'm using are: hbase: 0.94.6.1, hadoop: 1.0.4