
Below is the current Hadoop incompatibility issue we are running into.

USE-CASE

We are reading/scanning from HBase (version 0.96.1.2.0.6.1-101-hadoop2) running on new Hadoop (version 2.2.0.2.0.6.0-101, Hortonworks) and writing to old Hadoop (version 0.20.2+320, Cloudera) using a Java program. However, we are getting an exception due to incompatibility between the two Hadoop versions.

The below snippet throws an exception:

private HbaseConfigFactory(String clusterUri, String hbaseRootdir) throws Exception {
    factoryImpl = HBaseConfiguration.create();
    factoryImpl.clear();

    factoryImpl.set("hbase.zookeeper.quorum", clusterUri);
    factoryImpl.set("zookeeper.znode.parent", hbaseRootdir);

    // set the ZooKeeper port if the URI contains one
    String[] eles = clusterUri.split(":");
    if (eles.length > 1) {
        factoryImpl.set("hbase.zookeeper.property.clientPort", eles[1]);
    }

    try {
        // THE LINE BELOW CAUSES THE EXCEPTION
        HBaseAdmin.checkHBaseAvailable(factoryImpl);
    } catch (Exception e) {
        String message = String.format("HBase is currently unavailable: %s, %s",
                e.getMessage(), e);
        logger.error(message);

        throw new Exception(e);
    }
}

Below is the exception:

java.lang.Exception: java.lang.IllegalArgumentException: Can't find method getCurrentUser in org.apache.hadoop.security.UserGroupInformation!
    at com.shopping.writetold.HbaseConfigFactory.<init>(HbaseConfigFactory.java:36)
    at com.shopping.writetold.HbaseConfigFactory.getInstance(HbaseConfigFactory.java:48)
    at com.shopping.writetold.WriteToHDFS.readDeals(WriteToHDFS.java:63)
    at com.shopping.writetold.WriteToHDFS.main(WriteToHDFS.java:50)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)
Caused by: java.lang.IllegalArgumentException: Can't find method getCurrentUser in org.apache.hadoop.security.UserGroupInformation!
    at org.apache.hadoop.hbase.util.Methods.call(Methods.java:45)
    at org.apache.hadoop.hbase.security.User.call(User.java:414)
    at org.apache.hadoop.hbase.security.User.callStatic(User.java:404)
    at org.apache.hadoop.hbase.security.User.access$200(User.java:48)
    at org.apache.hadoop.hbase.security.User$SecureHadoopUser.<init>(User.java:221)
    at org.apache.hadoop.hbase.security.User$SecureHadoopUser.<init>(User.java:216)
    at org.apache.hadoop.hbase.security.User.getCurrent(User.java:139)
    at org.apache.hadoop.hbase.client.HConnectionKey.<init>(HConnectionKey.java:67)
    at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:240)
    at org.apache.hadoop.hbase.client.HBaseAdmin.checkHBaseAvailable(HBaseAdmin.java:2321)
    at com.shopping.writetold.HbaseConfigFactory.<init>(HbaseConfigFactory.java:29)
    ... 8 more
Caused by: java.lang.NoSuchMethodException: org.apache.hadoop.security.UserGroupInformation.getCurrentUser()
    at java.lang.Class.getMethod(Class.java:1624)
    at org.apache.hadoop.hbase.util.Methods.call(Methods.java:38)
    ... 18 more

Maven dependency entries:

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>0.20.2</version>
    </dependency>

    <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-client</artifactId>
        <version>0.96.0-hadoop2</version>
    </dependency>

Jar details: org.apache.hadoop:hadoop-common:2.1.0-beta (hadoop-common-2.1.0-beta.jar)

Method signature in its UserGroupInformation class file:

    public static synchronized org.apache.hadoop.security.UserGroupInformation getCurrentUser() throws java.io.IOException

Jar details: org.apache.hadoop:hadoop-core:0.20.2 (hadoop-core-0.20.2.jar)

Method signature in its UserGroupInformation class file:

    static javax.security.auth.Subject getCurrentUser()

Both classes live in the same package: org.apache.hadoop.security.

When I run separate programs to read from HBase and to write to Cloudera HDFS, each with only its respective jars, they work fine.

Is there any solution for handling the above incompatibility in a single program?

Thanks Sagar B


1 Answer

DISCLAIMER: As a prerequisite I take it that updating to a single, uniform latest Hadoop library is out of the question; also, I hardly know anything about Hadoop itself.

Essentially, you are in a conflict because you need both libraries on the classpath at the same time at runtime, which is a hard task. In order to have two classes with the same fully qualified name but from different sources in the same VM, you will need at least two different classloaders.

The thing to do in this scenario, from a technical/architectural point of view, is to decouple the two parts of the application: either run them under different classloaders in the same VM, or actually split them into heterogeneous programs exchanging messages over a shared mechanism (JMS comes to mind, but there are plenty of alternatives).

Since you want to explore the single-VM route, you are faced with two options: doing it manually, or using an application container that supports this (OSGi). In either case you will need to at least decouple the applications in Maven to differentiate their dependencies.

Manually would mean keeping one part of the app in the current classloader and loading the second part through a custom classloader. Presuming the write part is offloaded into a separate jar, create a custom classloader that loads the old Hadoop jar (and its transitive dependencies, where applicable) together with that separate jar. Rather technical, but doable.
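A minimal sketch of that manual approach, assuming the old-Hadoop write code has been moved into its own jar. The jar paths, the `com.example.OldHdfsWriter` class, and its `write` method are all hypothetical names for illustration:

```java
import java.io.File;
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;

public class IsolatedLoaderFactory {

    // Build a classloader with a null parent: lookups fall back only to the
    // bootstrap loader, so it never sees the hadoop-common 2.x classes that
    // sit on the application classpath.
    public static URLClassLoader isolatedLoader(String... jarPaths) throws Exception {
        URL[] urls = new URL[jarPaths.length];
        for (int i = 0; i < jarPaths.length; i++) {
            urls[i] = new File(jarPaths[i]).toURI().toURL();
        }
        return new URLClassLoader(urls, null);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical jars: the old Hadoop core plus your write-side code.
        URLClassLoader oldLoader = isolatedLoader(
                "lib/hadoop-core-0.20.2.jar",
                "lib/old-hdfs-writer.jar");
        // Everything on the old-Hadoop side must be reached via reflection,
        // because its classes are invisible to the application classloader.
        Class<?> writerClass = oldLoader.loadClass("com.example.OldHdfsWriter");
        Method write = writerClass.getMethod("write", String.class);
        write.invoke(writerClass.newInstance(), "/some/hdfs/path");
    }
}
```

The price of the null-parent trick is that the two halves can only exchange JDK types (String, byte[], streams) across the classloader boundary; sharing your own types would pull them into both loaders and reintroduce the conflict.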

I found a reference question using java.util.ServiceLoader that may shed light on the topic; use at your own peril: Dynamically loading plugin jars using ServiceLoader.
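ServiceLoader can tidy up the reflective lookup: instead of hard-coding a class name, the write-side jar registers an implementation of a shared interface under META-INF/services/, and you resolve it against the isolated classloader. The `HdfsWriterService` interface below is a hypothetical name; in this self-contained sketch no provider is registered, so the lookup comes back empty:

```java
import java.util.Iterator;
import java.util.ServiceLoader;

// Hypothetical service interface; the old-Hadoop jar would ship an
// implementation plus a META-INF/services/HdfsWriterService entry.
interface HdfsWriterService {
    void write(String path, byte[] data);
}

public class PluginLoaderDemo {

    // Find implementations visible to the given classloader only. Passing an
    // isolated loader over the old Hadoop jar surfaces its provider without
    // exposing the conflicting Hadoop classes to the rest of the app.
    public static Iterator<HdfsWriterService> findWriters(ClassLoader loader) {
        return ServiceLoader.load(HdfsWriterService.class, loader).iterator();
    }

    public static void main(String[] args) {
        Iterator<HdfsWriterService> it =
                findWriters(PluginLoaderDemo.class.getClassLoader());
        // No provider registered in this sketch, so the loop never runs.
        while (it.hasNext()) {
            it.next().write("/some/hdfs/path", new byte[0]);
        }
    }
}
```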

Another decoupling solution that works for exactly this reason is the OSGi model, which gives each jar (bundle) its own runtime dependency tree in a peer hierarchy. Since each bundle has its own classloader, the same class may exist in multiple versions in the VM. However, OSGi is another beast for many other reasons, and requires a somewhat steep learning effort to really understand and utilize.
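For flavour, the isolation is declared in each bundle's MANIFEST.MF rather than in code. A rough sketch of what the write-side bundle's headers could look like (symbolic name and version range are assumptions, and since the stock Hadoop jars are not OSGi bundles they would first have to be wrapped or embedded):

```
Bundle-ManifestVersion: 2
Bundle-SymbolicName: com.example.oldhadoop.writer
Bundle-Version: 1.0.0
Import-Package: org.apache.hadoop.security;version="[0.20,0.21)"
```

The OSGi framework then wires each bundle only to packages matching its declared version range, which is how two versions of org.apache.hadoop.security can coexist in one VM.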

Niels Bech Nielsen