
I'm using:

  • hadoop-client 2.2.0
  • mrunit 1.0.0
  • avro 1.7.6
  • avro-mrunit 1.7.6

... and the entire thing is being built and tested using Maven.

I was getting a NullPointerException until I followed the instructions in the question "MRUnit with Avro NullPointerException in Serialization".

Now I am getting an InstantiationException:

Running mypackage.MyTest
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
2014-03-23 20:49:21.463 java[27994:1003] Unable to load realm info from SCDynamicStore
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.945 sec <<< FAILURE!
process(mypackage.MyTest)  Time elapsed: 0.909 sec  <<< ERROR!
java.lang.RuntimeException: java.lang.InstantiationException
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
    at org.apache.hadoop.io.serializer.SerializationFactory.add(SerializationFactory.java:72)
    at org.apache.hadoop.io.serializer.SerializationFactory.<init>(SerializationFactory.java:63)
    at org.apache.hadoop.mrunit.internal.io.Serialization.<init>(Serialization.java:37)
    at org.apache.hadoop.mrunit.TestDriver.getSerialization(TestDriver.java:464)
    at org.apache.hadoop.mrunit.TestDriver.copy(TestDriver.java:608)
    at org.apache.hadoop.mrunit.TestDriver.copyPair(TestDriver.java:612)
    at org.apache.hadoop.mrunit.MapDriverBase.addInput(MapDriverBase.java:118)
    at org.apache.hadoop.mrunit.MapDriverBase.withInput(MapDriverBase.java:207)
    at mypackage.MyTest.process(MyTest.java:92)
...

The Avro model looks like this:

{
    "namespace": "model",
    "type": "record",
    "name": "Blob",
    "fields": [
        { "name": "value", "type": "string" }
    ]
}

The mapper looks like this:

public class MyMapper
    extends Mapper<AvroKey<Blob>, NullWritable, LongWritable, NullWritable>
{
    @Override
    public void map(AvroKey<Blob> key, NullWritable value, Context context)
            throws IOException, InterruptedException {
        context.write(new LongWritable(0), NullWritable.get());
    }
}

The test that is failing (the only test I have at the moment) looks like this:

@Test
public void process() throws IOException {
    mapper = new MyMapper();
    job = Job.getInstance();
    mapDriver = MapDriver.newMapDriver(mapper);

    Configuration configuration = mapDriver.getConfiguration();
    //Copy over the default io.serializations. If you don't do this then you will 
    //not be able to deserialize the inputs to the mapper
    String[] serializations = configuration.getStrings("io.serializations");
    serializations = Arrays.copyOf(serializations, serializations.length + 1);
    serializations[serializations.length-1] = AvroSerialization.class.getName();
    configuration.setStrings("io.serializations", serializations);

    //Configure AvroSerialization by specifying the key writer and value writer schemas
    configuration.setStrings("avro.serialization.key.writer.schema", Schema.create(Schema.Type.LONG).toString(true));
    configuration.setStrings("avro.serialization.value.writer.schema", Schema.create(Schema.Type.NULL).toString(true));

    job.setMapperClass(MyMapper.class);
    job.setInputFormatClass(AvroKeyInputFormat.class);
    AvroJob.setInputKeySchema(job, Blob.SCHEMA$);
    job.setOutputKeyClass(LongWritable.class);

    input = Blob.newBuilder()
        .setValue("abc")
        .build();
    mapDriver
        .withInput(new AvroKey<Blob>(input), NullWritable.get())
        .withOutput(new LongWritable(0), NullWritable.get())
        .runTest();
}

I'm pretty new to both Avro and MRUnit, so I am still trying to fully understand how they work together. The unit test output also contains log4j warnings, and I can't rule out that they are part of the problem (though I doubt it).

seawolf

1 Answer


Try this. Although the exception is thrown from ReflectionUtils, the enclosing frames are all about serialization, and you have not implemented a Writable of your own, so I think the Avro serialization is not set up correctly.

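        // MyMapper and Blob are the mapper and Avro record from the question.
        // AvroSerialization is org.apache.avro.hadoop.io.AvroSerialization.
        MapDriver<AvroKey<Blob>, NullWritable, LongWritable, NullWritable> driver =
                MapDriver.newMapDriver(new MyMapper());

        Configuration conf = driver.getConfiguration();
        // Register AvroSerialization and give it the schema of the mapper's input key.
        AvroSerialization.addToConfiguration(conf);
        AvroSerialization.setKeyWriterSchema(conf, Blob.SCHEMA$);
        AvroSerialization.setKeyReaderSchema(conf, Blob.SCHEMA$);
        Job job = new Job(conf);
        // job.set...      (your job settings)
        // AvroJob.set...  (your Avro job settings)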
WeiChing 林煒清
  • That did it... the three AvroSerialization.xxx() calls were the piece I was missing. Thank you **so** much! – seawolf Mar 30 '14 at 06:22
  • FWIW, I had tried that early on (because it was shown here: http://stackoverflow.com/questions/15230482/mrunit-with-avro-nullpointerexception-in-serialization) and it didn't work. I wonder what changed in my code so that it works now. Strange. – seawolf Mar 31 '14 at 08:20
  • The one issue with this is that new Job(conf) creates a copy of the configuration which means that any configuration applied via the Job object will not be reflected when the mapper runs. You'll have to set your configuration on the original configuration object returned from the driver instead. – Sam M Oct 15 '14 at 23:18
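
The copy behaviour described in the last comment can be illustrated without Hadoop on the classpath. The sketch below is only an analogy: java.util.Properties stands in for Configuration, and CopyingJob is a hypothetical class standing in for Job. Neither is a Hadoop type. The only thing carried over is the behaviour stated in the comment, namely that the constructor copies the configuration it receives.

```java
import java.util.Properties;

public class ConfigCopyDemo {

    /** Hypothetical stand-in for Job: keeps its own copy of the configuration. */
    static final class CopyingJob {
        private final Properties conf = new Properties();

        CopyingJob(Properties original) {
            // Defensive copy, modelling the behaviour described for new Job(conf).
            conf.putAll(original);
        }

        Properties getConfiguration() {
            return conf;
        }
    }

    /** Sets a key through the job only, then returns what the driver-side configuration sees. */
    static String setViaJob(Properties driverConf, String key, String value) {
        CopyingJob job = new CopyingJob(driverConf);
        job.getConfiguration().setProperty(key, value);
        return driverConf.getProperty(key);
    }

    /** Sets a key directly on the driver-side configuration and reads it back. */
    static String setViaDriverConf(Properties driverConf, String key, String value) {
        driverConf.setProperty(key, value);
        return driverConf.getProperty(key);
    }

    public static void main(String[] args) {
        Properties driverConf = new Properties();
        // The setting lands in the job's private copy, so the driver side sees nothing.
        System.out.println(setViaJob(driverConf, "my.setting", "42"));        // prints "null"
        // Setting it on the original object is visible where the mapper would read it.
        System.out.println(setViaDriverConf(driverConf, "my.setting", "42")); // prints "42"
    }
}
```

If the real classes behave as the comment describes, the practical consequence for the test is to apply the AvroSerialization calls, and any other settings the mapper needs, to the object returned by the driver's getConfiguration() rather than through the Job.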