
I have a MapReduce program with only a mapper and no reducer set. I want to test this. My test code is below:

    @Test
    public void testMapper() throws IOException {

      mapDriver.withInput(new LongWritable(0L), new Text(
          "af00bac654249b9d27982f19064338f4,54.0258822077885,-1.56832133466378,20121022,105507,026542913532,2093,87"));
      mapDriver.withOutput(null, [some value]);
      mapDriver.runTest();
    }

The line mapDriver.withOutput(null, [some value]); throws the exception below:

java.lang.NullPointerException
    at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:58)
    at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:91)
    at org.apache.hadoop.mrunit.internal.io.Serialization.copyWithConf(Serialization.java:104)

Note: the mapper's generic signature is Mapper<LongWritable, Text, Void, GenericRecord>.

Could somebody please let me know how to write test scenarios for a mapper that emits null keys?

If I use NullWritable.get() instead, I get the exception below:

java.lang.NullPointerException
    at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:73)
    at org.apache.hadoop.mrunit.internal.io.Serialization.copy(Serialization.java:91)
    at org.apache.hadoop.mrunit.internal.io.Serialization.copyWithConf(Serialization.java:104)
    at org.apache.hadoop.mrunit.TestDriver.copy(TestDriver.java:608)
    at org.apache.hadoop.mrunit.TestDriver.copyPair(TestDriver.java:612)
    at org.apache.hadoop.mrunit.TestDriver.addOutput(TestDriver.java:118)
    at org.apache.hadoop.mrunit.TestDriver.withOutput(TestDriver.java:138)
    at com.gfk.gxl.etl.common.ExtractCSVTest.testMapper(ExtractCSVTest.java:73)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

This looks similar to MRUnit with Avro NullPointerException in Serialization, but the answer there does not solve my problem.

Update after some more research: org.apache.hadoop.mrunit.internal.io.Serialization cannot obtain a serializer or deserializer for class org.apache.avro.generic.GenericData$Record. Both come back null, which is what causes the NullPointerException.
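To see why both come back null: Hadoop's SerializationFactory walks the classes listed under io.serializations and returns the first one whose accept() method matches the requested class. None of the defaults accept GenericData$Record (WritableSerialization only accepts Writables, and the built-in Avro serializations only cover specific and reflect records). A small standalone sketch to confirm this, using the stock SerializationFactory API:

    import org.apache.avro.generic.GenericData;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.serializer.SerializationFactory;

    public class SerializationProbe {
      public static void main(String[] args) {
        SerializationFactory factory = new SerializationFactory(new Configuration());
        // Prints null: nothing in the default io.serializations accepts
        // GenericData.Record, which is why MRUnit's internal Serialization
        // ends up with a null serializer/deserializer and throws the NPE.
        System.out.println(factory.getSerialization(GenericData.Record.class));
      }
    }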



For reference, here is the relevant snippet from org.apache.hadoop.mrunit.internal.io.Serialization, lines 61 to 70:

  try {
      serializer = (Serializer<Object>) serializationFactory
          .getSerializer(clazz);
      deserializer = (Deserializer<Object>) serializationFactory
          .getDeserializer(clazz);
    } catch (NullPointerException e) {
      throw new IllegalStateException(
          "No applicable class implementing Serialization in conf at io.serializations for "
              + orig.getClass(), e);
    }
In the snippet above, serializer and deserializer are both coming back null. Is there some way to avoid this?
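One way to avoid it, at least for the value half of the pair, is to register a serialization in io.serializations that does accept GenericRecord. Avro's own AvroSerialization does not help here, because it only accepts the AvroKey/AvroValue wrapper types, not a raw GenericData$Record. Below is a minimal sketch of such a class; GenericRecordSerialization and its SCHEMA_KEY configuration key are names made up for illustration, and the record schema is assumed to be fixed and known at test time:

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;

    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericDatumReader;
    import org.apache.avro.generic.GenericDatumWriter;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.avro.io.BinaryDecoder;
    import org.apache.avro.io.BinaryEncoder;
    import org.apache.avro.io.DecoderFactory;
    import org.apache.avro.io.EncoderFactory;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.io.serializer.Deserializer;
    import org.apache.hadoop.io.serializer.Serialization;
    import org.apache.hadoop.io.serializer.Serializer;

    // Hypothetical helper, not part of MRUnit or Avro: a Hadoop Serialization
    // for raw GenericRecord values. The writer/reader schema is taken from the
    // configuration under a made-up key, set by the test.
    public class GenericRecordSerialization extends Configured
        implements Serialization<GenericRecord> {

      public static final String SCHEMA_KEY = "test.genericrecord.schema";

      @Override
      public boolean accept(Class<?> c) {
        return GenericRecord.class.isAssignableFrom(c);
      }

      // Parse the schema that the test put into the configuration.
      private Schema schema() {
        return new Schema.Parser().parse(getConf().get(SCHEMA_KEY));
      }

      @Override
      public Serializer<GenericRecord> getSerializer(Class<GenericRecord> c) {
        return new Serializer<GenericRecord>() {
          private final GenericDatumWriter<GenericRecord> writer =
              new GenericDatumWriter<GenericRecord>(schema());
          private OutputStream out;
          private BinaryEncoder encoder;

          @Override
          public void open(OutputStream stream) {
            out = stream;
            encoder = EncoderFactory.get().binaryEncoder(stream, null);
          }

          @Override
          public void serialize(GenericRecord record) throws IOException {
            writer.write(record, encoder);
            encoder.flush();
          }

          @Override
          public void close() throws IOException {
            out.close();
          }
        };
      }

      @Override
      public Deserializer<GenericRecord> getDeserializer(Class<GenericRecord> c) {
        return new Deserializer<GenericRecord>() {
          private final GenericDatumReader<GenericRecord> reader =
              new GenericDatumReader<GenericRecord>(schema());
          private InputStream in;
          private BinaryDecoder decoder;

          @Override
          public void open(InputStream stream) {
            in = stream;
            decoder = DecoderFactory.get().binaryDecoder(stream, null);
          }

          @Override
          public GenericRecord deserialize(GenericRecord reuse) throws IOException {
            return reader.read(reuse, decoder);
          }

          @Override
          public void close() throws IOException {
            in.close();
          }
        };
      }
    }

The test would then register it on the driver's configuration before calling withOutput, something like this (mySchema being the record's Schema):

    Configuration conf = mapDriver.getConfiguration();
    // Append to, rather than replace, the default serializations
    conf.set("io.serializations", conf.get("io.serializations") + ","
        + GenericRecordSerialization.class.getName());
    conf.set(GenericRecordSerialization.SCHEMA_KEY, mySchema.toString());

Note this only cures the GenericData$Record value; the null (Void) key still cannot be copied, as the second answer below explains.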

2 Answers


Use the NullWritable.get() method instead. Hope this helps.
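For instance (a sketch: expectedRecord stands in for the GenericRecord the mapper is expected to emit, and this only applies if the mapper declares NullWritable, not Void, as its output key type):

    // NullWritable is a real singleton that MRUnit can copy, unlike a bare null
    mapDriver.withOutput(NullWritable.get(), expectedRecord);

As noted in the question, the GenericRecord value still needs a registered serialization for the test to pass.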


Unfortunately, although Hadoop can accept null keys, you cannot use null keys in MRUnit for now. The MRUnit team plans to support null keys in the future; see allow null keys and values as output, expected output.
