
I have a Java web service (built with a proprietary technology at my company) that handles requests and responses. While processing a request, it tries to connect to Hadoop's Hive and execute a query. However, it fails immediately when I try to initialize the connection.

Here is the line of code it fails on. I am largely using the code sample from https://cwiki.apache.org/confluence/display/Hive/HiveClient:

String connString = "jdbc:hive://";
Connection con = DriverManager.getConnection(connString, "", "");

Here is the stack trace:

javax.jdo.JDOFatalInternalException: Unexpected exception caught.
    at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1186)
    at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:803)
    at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:698)
    at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:246)
    at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:275)
    at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:208)
    at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:183)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:70)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:130)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:407)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.executeWithRetry(HiveMetaStore.java:359)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:504)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:266)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.<init>(HiveMetaStore.java:228)
    at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.<init>(HiveServer.java:131)
    at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.<init>(HiveServer.java:121)
    at org.apache.hadoop.hive.jdbc.HiveConnection.<init>(HiveConnection.java:76)
    at org.apache.hadoop.hive.jdbc.HiveDriver.connect(HiveDriver.java:104)
    at java.sql.DriverManager.getConnection(DriverManager.java:582)
    at java.sql.DriverManager.getConnection(DriverManager.java:185)
    at (...my package...).RemoteCtrbTest.kickOffRemoteTest(RemoteCtrbTest.java:52)

NestedThrowablesStackTrace:
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at javax.jdo.JDOHelper$16.run(JDOHelper.java:1958)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.jdo.JDOHelper.invoke(JDOHelper.java:1953)
    at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1159)
    at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:803)
    at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:698)
    at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:246)
    at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:275)
    at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:208)
    at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:183)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:70)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:130)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:407)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.executeWithRetry(HiveMetaStore.java:359)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:504)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:266)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.<init>(HiveMetaStore.java:228)
    at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.<init>(HiveServer.java:131)
    at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.<init>(HiveServer.java:121)
    at org.apache.hadoop.hive.jdbc.HiveConnection.<init>(HiveConnection.java:76)
    at org.apache.hadoop.hive.jdbc.HiveDriver.connect(HiveDriver.java:104)
    at java.sql.DriverManager.getConnection(DriverManager.java:582)
    at java.sql.DriverManager.getConnection(DriverManager.java:185)
    at (...my package...).RemoteCtrbTest.kickOffRemoteTest(RemoteCtrbTest.java:52)

Caused by: org.datanucleus.exceptions.NucleusUserException: Persistence process has been specified to use a ClassLoaderResolver of name "jdo" yet this has not been found by the DataNucleus plugin mechanism. Please check your CLASSPATH and plugin specification.
    at org.datanucleus.OMFContext.getClassLoaderResolver(OMFContext.java:319)
    at org.datanucleus.OMFContext.<init>(OMFContext.java:165)
    at org.datanucleus.OMFContext.<init>(OMFContext.java:137)
    at org.datanucleus.ObjectManagerFactoryImpl.initialiseOMFContext(ObjectManagerFactoryImpl.java:132)
    at org.datanucleus.jdo.JDOPersistenceManagerFactory.initialiseProperties(JDOPersistenceManagerFactory.java:363)
    at org.datanucleus.jdo.JDOPersistenceManagerFactory.<init>(JDOPersistenceManagerFactory.java:307)
    at org.datanucleus.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:255)
    at org.datanucleus.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:182)
    ... 35 more

I found one other question with a similar error message, but it was about Maven and did not involve Hive (which is what uses DataNucleus in its code): Datanucleus, JDO and executable jar - how to do it?

I am using a hive-site.xml file to specify some properties for Hive and DataNucleus. The DataNucleus ones are below. I added the last two while trying to fix the issue. Whatever value I set for datanucleus.classLoaderResolverName shows up as the quoted name in the error message.

<property>
  <name>datanucleus.autoCreateSchema</name>
  <value>false</value>
</property>

<property>
  <name>datanucleus.fixedDatastore</name>
  <value>true</value>
</property>

<property>
  <name>datanucleus.classLoaderResolverName</name>
  <value>jdo</value>
</property>

<property>
  <name>javax.jdo.PersistenceManagerFactoryClass</name>
  <value>org.datanucleus.jdo.JDOPersistenceManagerFactory</value>
</property>

What I cannot figure out is whether the service is somehow re-bundling the jars, as in the other Stack Overflow question I linked above, and thereby messing up the location of plugin.xml and/or the MANIFEST.MF file. I'm also not sure how the plugin file interacts with the hive-site.xml file.

Our environment requires adding specific jars individually rather than just pointing at a classpath. I am using the following DataNucleus jars:

* datanucleus-connectionpool-2.0.3.jar
* datanucleus-enhancer-2.0.3.jar
* datanucleus-rdbms-2.0.3.jar
* datanucleus-core-2.0.3.jar

Any input would be greatly appreciated. I can provide more information if needed, so please do ask.

jmo

2 Answers


If the application built with your company's proprietary technology uses the Spring Framework, you can take advantage of the Spring for Apache Hadoop support.

Add the following configuration to your applicationContext:

<hdp:configuration>
    fs.default.name=${fs.default.name.url}
    mapred.job.tracker=${mapred.job.tracker.url}
</hdp:configuration>

<hdp:hive-client-factory host="${hadoop.hive.host.url}" port="10000"
    xmlns="http://www.springframework.org/schema/hadoop" />

<hdp:hive-template />

After that, autowire HiveTemplate:

@Autowired
HiveTemplate hiveTemplate;

And then query Hive as shown below:

List<String> list = hiveTemplate.query(queryString, parameterMap);
Shishir Kumar

DataNucleus apparently uses an OSGi-based plugin mechanism. If you are not running it in an OSGi container and are just using a standard Maven project, what is probably happening is that the plugins are on the classpath but are not registered because of a problem with their manifests. You might try something like this, which copies the dependency jars intact into a directory instead of merging them:

<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-dependency-plugin</artifactId>
            <version>2.4</version>
            <executions>
                <execution>
                    <id>copy-dependencies</id>
                    <phase>package</phase>
                    <goals>
                        <goal>copy-dependencies</goal>
                    </goals>
                    <configuration>
                        <outputDirectory>${project.build.directory}/jars</outputDirectory>
                        <overWriteReleases>false</overWriteReleases>
                        <overWriteSnapshots>false</overWriteSnapshots>
                        <overWriteIfNewer>true</overWriteIfNewer>
                    </configuration>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
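
To check whether the plugin descriptors are actually visible at runtime, a small diagnostic along these lines may help. It is only a sketch: the class name PluginDescriptorScan is made up for illustration, it uses nothing beyond the JDK (Java 8 or later), and it assumes that the DataNucleus jars ship a plugin.xml at the jar root and a Bundle-SymbolicName entry in META-INF/MANIFEST.MF. It lists every copy of those files the context class loader can see, so you can tell whether the jars were merged or repackaged.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;
import java.util.jar.Manifest;

// Hypothetical helper, not part of Hive or DataNucleus.
public class PluginDescriptorScan {

    /** Returns every location on the classpath that provides the named resource. */
    public static List<URL> findResources(String name) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        try {
            return Collections.list(loader.getResources(name));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads Bundle-SymbolicName from a manifest, or returns null if it is absent. */
    public static String bundleName(URL manifestUrl) {
        try (InputStream in = manifestUrl.openStream()) {
            return new Manifest(in).getMainAttributes().getValue("Bundle-SymbolicName");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // One line per jar (or directory) that still carries its own plugin.xml.
        for (URL url : findResources("plugin.xml")) {
            System.out.println("plugin.xml: " + url);
        }
        // Manifests that still identify themselves as a DataNucleus bundle.
        for (URL url : findResources("META-INF/MANIFEST.MF")) {
            String bundle = bundleName(url);
            if (bundle != null && bundle.contains("datanucleus")) {
                System.out.println("bundle " + bundle + ": " + url);
            }
        }
    }
}
```

Run it with the same classpath as the failing service. If fewer plugin.xml or DataNucleus manifest entries are printed than there are DataNucleus jars, that would be consistent with the jars having been re-bundled.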

This was answered previously.

Peter G