0

I've encountered the following NPE when storm tries to deserialize. I did not use OutputCollector concurrently in my code. The only object we are passing between bolts are a thrift object, and we have written a serializer for it. I've attached the code of serializer and please help to check whether there are any potential bugs there.

2016-03-04 17:17:43.583 b.s.util [ERROR] Async loop died!
java.lang.RuntimeException: java.lang.NullPointerException
        at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:135) ~[storm-core-0.10.0.jar:0.10.0]
        at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:106) ~[storm-core-0.10.0.jar:0.10.0]
        at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80) ~[storm-core-0.10.0.jar:0.10.0]
        at backtype.storm.daemon.executor$fn__5694$fn__5707$fn__5758.invoke(executor.clj:819) ~[storm-core-0.10.0.jar:0.10.0]
        at backtype.storm.util$async_loop$fn__545.invoke(util.clj:479) [storm-core-0.10.0.jar:0.10.0]
        at clojure.lang.AFn.run(AFn.java:22) [clojure-1.6.0.jar:?]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_60]
Caused by: java.lang.NullPointerException
        at com.esotericsoftware.kryo.io.Input.setBuffer(Input.java:57) ~[kryo-2.21.jar:?]
        at backtype.storm.serialization.KryoTupleDeserializer.deserialize(KryoTupleDeserializer.java:47) ~[storm-core-0.10.0.jar:0.10.0]
        at backtype.storm.daemon.executor$mk_task_receiver$fn__5615.invoke(executor.clj:433) ~[storm-core-0.10.0.jar:0.10.0]
        at backtype.storm.disruptor$clojure_handler$reify__5189.onEvent(disruptor.clj:58) ~[storm-core-0.10.0.jar:0.10.0]
        at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:132) ~[storm-core-0.10.0.jar:0.10.0]
        ... 6 more
2016-03-04 17:17:43.584 b.s.d.executor [ERROR]
java.lang.RuntimeException: java.lang.NullPointerException
        at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:135) ~[storm-core-0.10.0.jar:0.10.0]
        at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:106) ~[storm-core-0.10.0.jar:0.10.0]
        at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80) ~[storm-core-0.10.0.jar:0.10.0]
        at backtype.storm.daemon.executor$fn__5694$fn__5707$fn__5758.invoke(executor.clj:819) ~[storm-core-0.10.0.jar:0.10.0]
        at backtype.storm.util$async_loop$fn__545.invoke(util.clj:479) [storm-core-0.10.0.jar:0.10.0]
        at clojure.lang.AFn.run(AFn.java:22) [clojure-1.6.0.jar:?]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_60]
Caused by: java.lang.NullPointerException
        at com.esotericsoftware.kryo.io.Input.setBuffer(Input.java:57) ~[kryo-2.21.jar:?]
        at backtype.storm.serialization.KryoTupleDeserializer.deserialize(KryoTupleDeserializer.java:47) ~[storm-core-0.10.0.jar:0.10.0]
        at backtype.storm.daemon.executor$mk_task_receiver$fn__5615.invoke(executor.clj:433) ~[storm-core-0.10.0.jar:0.10.0]
        at backtype.storm.disruptor$clojure_handler$reify__5189.onEvent(disruptor.clj:58) ~[storm-core-0.10.0.jar:0.10.0]
        at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:132) ~[storm-core-0.10.0.jar:0.10.0]
        ... 6 more
2016-03-04 17:17:43.648 b.s.util [ERROR] Halting process: ("Worker died")
java.lang.RuntimeException: ("Worker died")
        at backtype.storm.util$exit_process_BANG_.doInvoke(util.clj:336) [storm-core-0.10.0.jar:0.10.0]
        at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.6.0.jar:?]
        at backtype.storm.daemon.worker$fn__7188$fn__7189.invoke(worker.clj:536) [storm-core-0.10.0.jar:0.10.0]
        at backtype.storm.daemon.executor$mk_executor_data$fn__5523$fn__5524.invoke(executor.clj:261) [storm-core-0.10.0.jar:0.10.0]
        at backtype.storm.util$async_loop$fn__545.invoke(util.clj:489) [storm-core-0.10.0.jar:0.10.0]
        at clojure.lang.AFn.run(AFn.java:22) [clojure-1.6.0.jar:?]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_60]

import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.Serializer;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;
import org.apache.thrift.TBase;
import org.apache.thrift.TDeserializer;
import org.apache.thrift.TException;
import org.apache.thrift.TSerializer;
import org.apache.thrift.protocol.TCompactProtocol;
import org.slf4j.Logger;

/**
 * @author zhangyf
 * @version 1.0 2016-02-19
 */

public class GZipThriftSerializer<T extends TBase> extends Serializer<T> {

  private static final int EMPTY = 0;

  // add DELTA to make length positive due to kryo internal optimization
  private static final int DELTA = 1;

  private static final Logger LOG = com.mediav.utils.Loggers.get();

  private static final ThreadLocal<TSerializer> COMPACT_T_SERIALIZER = new ThreadLocal<TSerializer>() {
    @Override
    protected TSerializer initialValue() {
      return new TSerializer(new TCompactProtocol.Factory());
    }
  };

  private static final ThreadLocal<TDeserializer> COMPACT_T_DESERIALIZER = new ThreadLocal<TDeserializer>() {
    @Override
    protected TDeserializer initialValue() {
      return new TDeserializer(new TCompactProtocol.Factory());
    }
  };

  {
    setAcceptsNull(true);
  }

  @Override
  public void write(Kryo kryo, Output output, T t) {
    if (t == null) {
      output.writeInt(EMPTY + DELTA, true);
      return;
    }
    try {
      byte[] data = COMPACT_T_SERIALIZER.get().serialize(t);
      byte[] compressed = GZipUtils.zip(data);
      output.writeInt(compressed.length + DELTA, true);
      output.writeBytes(compressed);
    } catch (TException e) {
      LOG.error("TException during serialization", e);
      output.writeInt(EMPTY + DELTA, true);
    }
  }

  @Override
  public T read(Kryo kryo, Input input, Class<T> aClass) {
    int length = input.readInt(true) - DELTA;
    if (length == 0) {
      return null;
    }
    try {
      T t = aClass.newInstance();
      COMPACT_T_DESERIALIZER.get().deserialize(t, GZipUtils.unzip(input.readBytes(length)));
      return t;
    } catch (InstantiationException e) {
      throw new RuntimeException(e);
    } catch (IllegalAccessException e) {
      throw new RuntimeException(e);
    } catch (TException e) {
      LOG.error("TException during deserialization", e);
    }
    return null;
  }
}
Kevin Cruijssen
  • 9,153
  • 9
  • 61
  • 135
Liu Renjie
  • 220
  • 2
  • 11
  • 1
    Possible duplicate of [What is a Null Pointer Exception, and how do I fix it?](http://stackoverflow.com/questions/218384/what-is-a-null-pointer-exception-and-how-do-i-fix-it) – Tim Biegeleisen Mar 17 '16 at 08:13
  • Did you register your serializer using `registerSerialization`? In the stack trace it doesn't look like it is even getting in to your `GZipThriftSerializer`. – Kit Menke Mar 17 '16 at 13:09
  • Yes, I did that. That's why I get so confusing since it seems like a system problem but I can not figure out what's the problem. – Liu Renjie Mar 17 '16 at 13:28
  • 1
    **This is caused by having multiple threads issue methods on the `OutputCollector`. All emits, acks, and fails must happen on the same thread. One subtle way this can happen is if you make a `IBasicBolt` that emits on a separate thread. `IBasicBolt`'s automatically `ack` after execute is called, so this would cause multiple threads to use the `OutputCollector` leading to this exception. When using a basic `bolt`, all emits must happen in the same thread that runs `execute`.** From Storm site: [storm NPE](http://storm.apache.org/releases/0.10.0/Troubleshooting.html) – Darpan27 Oct 27 '16 at 13:30

1 Answers1

0

I'm answering a 3 year old question because I spent quite some time figuring out the same issue myself. We also encountered a NullPointerException when deserializing tuples in Storm, and we didn't do anything multi-threaded in our bolts, in the way Darpan27 is suggesting.

It turned out that we were using the same object in 2 separate bolts that were running concurrently in the same JVM, and one of the bolts modified the object, while the other bolt sent it to the OutputCollector. Cloning the object before modifying it did resolve our problem. Our final solution was to write a custom Kryo serializer that ignored the properties that were modified (as we didn't need them further downstream).

Thomas
  • 56
  • 2