
I am trying to write Scalding jobs that connect to HBase, but I am having trouble with the HBase tap. I tried the tap provided by Twitter Maple, following this example project, but there seems to be an incompatibility between the Hadoop/HBase version I am running and the one Twitter used as the client.

My cluster is running Cloudera CDH4 with HBase 0.92 and Hadoop 2.0.0-cdh4.1.3. Whenever I launch a Scalding job that connects to HBase, I get this exception:

java.lang.NoSuchMethodError: org.apache.hadoop.net.NetUtils.getInputStream(Ljava/net/Socket;)Ljava/io/InputStream;
    at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:363)
    at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1046)
...

It seems that the HBase client used by Twitter Maple expects a method on NetUtils that does not exist in the version of Hadoop deployed on my cluster.

How do I track down exactly what the mismatch is, for example which Hadoop version the HBase client expects? Is there a general way to mitigate these issues?

It seems to me that client libraries are often compiled against a hardcoded version of the Hadoop dependencies, and it is hard to make those match the versions actually deployed.

Andrea

1 Answer


The method actually exists but has changed its signature. Basically, it boils down to having different versions of Hadoop libraries on your client and server. If your server is running Cloudera, you should be using the HBase and Hadoop libraries from Cloudera. If you're using Maven, you can use Cloudera's Maven repository.
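One way to check this on your own classpath is to list the overloads the JVM actually sees and the JAR they come from. The following is only a minimal sketch using plain Java reflection: the class name `SignatureCheck` and its helper methods are hypothetical names made up for this illustration, not part of Hadoop, HBase, or Maple. The descriptor in the stack trace, `(Ljava/net/Socket;)Ljava/io/InputStream;`, means the caller was compiled against a `getInputStream(Socket)` that returns `java.io.InputStream`, so the return type printed here is the thing to compare.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for this answer; it relies only on the JDK.
class SignatureCheck {

    // Loads the class without running its static initializers.
    private static Class<?> load(String className) throws ClassNotFoundException {
        return Class.forName(className, false, SignatureCheck.class.getClassLoader());
    }

    // Returns every public method named methodName on className,
    // rendered with its return type and parameter types.
    public static List<String> signatures(String className, String methodName)
            throws ClassNotFoundException {
        List<String> found = new ArrayList<>();
        for (Method m : load(className).getMethods()) {
            if (m.getName().equals(methodName)) {
                found.add(m.toString());
            }
        }
        return found;
    }

    // Returns the JAR or directory the class was loaded from,
    // or "(bootstrap)" when the JVM reports no code source.
    public static String origin(String className) throws ClassNotFoundException {
        CodeSource src = load(className).getProtectionDomain().getCodeSource();
        return src == null ? "(bootstrap)" : String.valueOf(src.getLocation());
    }

    public static void main(String[] args) {
        // Defaults match the class and method named in the stack trace.
        String className = args.length > 0 ? args[0] : "org.apache.hadoop.net.NetUtils";
        String methodName = args.length > 1 ? args[1] : "getInputStream";
        try {
            System.out.println("Loaded from: " + origin(className));
            for (String signature : signatures(className, methodName)) {
                System.out.println(signature);
            }
        } catch (ClassNotFoundException e) {
            System.out.println(className + " is not on the classpath");
        }
    }
}
```

Run it with the same classpath the job uses, for example `java -cp "$(hadoop classpath):." SignatureCheck`, assuming the `hadoop` command is available on that machine. If no printed signature has the return type from the stack trace, the Hadoop JAR shown on the "Loaded from" line is not the version the HBase client was built against.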

It seems like library dependencies are handled in Build.scala. I haven't used Scala yet, so I'm not entirely sure how to fix it there.

The change that broke compatibility was committed as part of HADOOP-8350. Take a look at Ted Yu's comments and the responses. He works on HBase and had the same issue. Later versions of the HBase libraries should automatically handle this issue, according to his comment.

kichik
  • Thank you, I suspected this. The problem is that the HBase client version is hardcoded in the Twitter Maple tap. So is my only option to compile the Twitter Maple collection myself with the right dependencies? Or are there simpler ways to make it work? – Andrea Mar 29 '13 at 08:13
  • If you're using Maven, you can [override dependencies](http://stackoverflow.com/questions/3937195/maven-how-to-override-the-dependency-added-by-a-library). Worst case scenario, build just your JAR and point the classpath to the right version of HBase. – kichik Mar 30 '13 at 01:40
  • I faced this issue with Hadoop 0.23.7, but looking into the sources I found the right signature in 0.23.1 (full story: http://www.yetanothercoder.ru/2013/05/2-days-of-integration-of-osgi-hadoop.html) – yetanothercoder May 28 '13 at 16:05