
I am trying to connect to Hive over JDBC using HikariCP (with Kerberos and a keytab) in my Spring Batch project.

The following is my JDBC DataSource config:

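// Imports assumed by this snippet (not shown in the original post);
// StringUtils is assumed to be the Apache Commons Lang 3 class.
import java.io.IOException;
import javax.sql.DataSource;
import org.apache.commons.lang3.StringUtils;
import org.apache.hadoop.security.UserGroupInformation;
import org.springframework.beans.factory.BeanInitializationException;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.context.annotation.Bean;
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

    @Bean(name = "hiveJdbcBatchDataSource")
    @Qualifier(value = "hiveJdbcBatchDataSource")
    public DataSource hiveJdbcBatchDataSource() throws Exception {

        try {
            HikariConfig config = new HikariConfig();
            config.setDriverClassName(driverClassName);
            config.setJdbcUrl(hiveUrl);

            // Kerberos: point the JVM at krb5.conf, then log in from the keytab if one is configured.
            System.setProperty("java.security.krb5.conf", krb5ConfPath);
            if (StringUtils.isNotBlank(keytabPath)) {
                org.apache.hadoop.conf.Configuration conf = new org.apache.hadoop.conf.Configuration();
                conf.set("hadoop.security.authentication", "kerberos");
                UserGroupInformation.setConfiguration(conf);
                UserGroupInformation.loginUserFromKeytab(principal, keytabPath);
            } else {
                // No keytab: fall back to username/password.
                System.setProperty("javax.security.auth.useSubjectCredsOnly", "false");
                config.setUsername(userName);
                config.setPassword(password);
            }

            config.setConnectionTestQuery("show databases");
            // Note: the four properties below are MySQL Connector/J settings.
            config.addDataSourceProperty("zeroDateTimeBehavior", zeroDateTimeBehavior);
            config.addDataSourceProperty("cachePrepStmts", cachePrepStmts);
            config.addDataSourceProperty("prepStmtCacheSize", prepStmtCacheSize);
            config.addDataSourceProperty("prepStmtCacheSqlLimit", prepStmtCacheSqlLimit);
            // Connection pooling
            config.setPoolName(poolName);
            config.setMaximumPoolSize(maximumPoolSize);
            config.setIdleTimeout(idleTimeoutMs);
            config.setMaxLifetime(maxLifetimeMs);

            return new HikariDataSource(config);

        } catch (IOException e) {
            // Thrown by the keytab login.
            throw new BeanInitializationException("IOException: failed to init data source.", e);
        } catch (Exception e) {
            throw new Exception("Exception: failed to init data source.", e);
        }
    }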

I am getting the following exception:

Caused by: org.springframework.batch.core.configuration.BatchConfigurationException: java.lang.IllegalArgumentException: DatabaseType not found for product name: [Apache Hive]
    at org.springframework.batch.core.configuration.annotation.DefaultBatchConfigurer.initialize(DefaultBatchConfigurer.java:119)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleElement.invoke(InitDestroyAnnotationBeanPostProcessor.java:363)
    at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleMetadata.invokeInitMethods(InitDestroyAnnotationBeanPostProcessor.java:307)
    at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor.postProcessBeforeInitialization(InitDestroyAnnotationBeanPostProcessor.java:136)
    ... 16 common frames omitted
Caused by: java.lang.IllegalArgumentException: DatabaseType not found for product name: [Apache Hive]
    at org.springframework.batch.support.DatabaseType.fromProductName(DatabaseType.java:84)
    at org.springframework.batch.support.DatabaseType.fromMetaData(DatabaseType.java:123)
    at org.springframework.batch.core.repository.support.JobRepositoryFactoryBean.afterPropertiesSet(JobRepositoryFactoryBean.java:183)
    at org.springframework.batch.core.configuration.annotation.DefaultBatchConfigurer.createJobRepository(DefaultBatchConfigurer.java:134)
    at org.springframework.batch.core.configuration.annotation.DefaultBatchConfigurer.initialize(DefaultBatchConfigurer.java:113)
    ... 23 common frames omitted
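
For context, the innermost cause in the trace is a lookup of the JDBC product name ("Apache Hive") against the set of database types Spring Batch knows for its job repository. The following is a simplified, self-contained model of that kind of lookup, not the actual Spring Batch source: the class name `ProductNameLookupSketch` is hypothetical and the list of product names is deliberately partial.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model only; NOT the Spring Batch implementation.
class ProductNameLookupSketch {

    // Partial, illustrative mapping from JDBC product name to a type label.
    private static final Map<String, String> KNOWN = new LinkedHashMap<>();
    static {
        KNOWN.put("MySQL", "MYSQL");
        KNOWN.put("PostgreSQL", "POSTGRES");
        KNOWN.put("Oracle", "ORACLE");
        KNOWN.put("HSQL Database Engine", "HSQL");
        KNOWN.put("H2", "H2");
    }

    // Returns the type label, or fails the same way the trace above does
    // when the product name has no entry.
    static String fromProductName(String productName) {
        String type = KNOWN.get(productName);
        if (type == null) {
            throw new IllegalArgumentException(
                    "DatabaseType not found for product name: [" + productName + "]");
        }
        return type;
    }

    public static void main(String[] args) {
        System.out.println(fromProductName("H2"));
        try {
            fromProductName("Apache Hive");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under this model, any DataSource handed to the job repository must report a product name that is in the known set, which the Hive driver does not.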

My pom contains the following dependencies:

        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-jdbc</artifactId>
            <version>1.2.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>2.7.3</version>
        </dependency>

Note: I have tried following the answer below, but I still get the same exception:

Using Spring Batch with auto-configure and a non-standard database

axnet
  • One important detail: do you want to use Apache Hive as the job repository for Spring Batch, or do you just need to read/write data from/to Hive (and use another DB for the Spring Batch metadata)? – Mahmoud Ben Hassine Oct 14 '19 at 07:36
  • @MahmoudBenHassine Thanks. I just need to read/write data from/to Hive (and use another DB for the Spring Batch metadata). – axnet Oct 14 '19 at 08:02
  • @MahmoudBenHassine Actually, I am using Spring Batch 4.1.2 and Spring Boot 2.1.7, with spring.datasource.initialization-mode=never and spring.batch.initialize-schema=never. I just want to connect to Hive using a JDBC URL (HikariCP or a simple DataSource) and a Kerberos keytab. I will run queries against this Hive instance in a Hive tasklet. (Note: my Hive has no username/password.) – axnet Oct 14 '19 at 08:08
  • Thanks for the update. The error means that the job repository factory bean is trying to use Hive as the datasource for the Spring Batch metadata. To use another datasource, please take a look at https://stackoverflow.com/questions/25540502/use-of-multiple-datasources-in-spring-batch – Mahmoud Ben Hassine Oct 14 '19 at 08:25
  • @MahmoudBenHassine Thanks for the link. Actually, I do not want to store the job repository metadata (the state management tables) in MySQL either, since MySQL is also one of the databases in my ETL pipeline. – axnet Oct 14 '19 at 09:46
  • In that case, you can use the Map-based job repository (but this is not recommended for production) or an embedded DB like HSQLDB or H2 (in which case you need to point the job repository at the embedded datasource, as shown in the previous link). – Mahmoud Ben Hassine Oct 14 '19 at 09:54
  • @praxnet Can you please show the import statement for DataSource? I just wanted to check the dependency. I'm running into the same problem. – DunJen.coder Sep 04 '20 at 09:54
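
The embedded-database suggestion from the comments could be sketched roughly as below. This is a minimal sketch, not a verified fix for this project. It assumes Spring Batch 4.x, H2 on the classpath, batch processing already enabled elsewhere in the application, and spring.batch.initialize-schema=never (as stated in the comments) so that the schema script is loaded only once. The class name `BatchMetaDataConfig` and the bean name `batchMetaDataSource` are hypothetical.

```java
import javax.sql.DataSource;

import org.springframework.batch.core.configuration.annotation.BatchConfigurer;
import org.springframework.batch.core.configuration.annotation.DefaultBatchConfigurer;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;
import org.springframework.jdbc.datasource.embedded.EmbeddedDatabaseBuilder;
import org.springframework.jdbc.datasource.embedded.EmbeddedDatabaseType;

// Hypothetical configuration class: keeps the Spring Batch metadata tables
// in an in-memory H2 database instead of Hive.
@Configuration
public class BatchMetaDataConfig {

    // Embedded H2 datasource holding only the job repository tables.
    // @Primary makes by-type DataSource injection (including the autowired
    // setter on DefaultBatchConfigurer) resolve to this bean, not the Hive one.
    @Bean(name = "batchMetaDataSource")
    @Primary
    public DataSource batchMetaDataSource() {
        return new EmbeddedDatabaseBuilder()
                .setType(EmbeddedDatabaseType.H2)
                // Schema script shipped inside the spring-batch-core jar.
                .addScript("classpath:org/springframework/batch/core/schema-h2.sql")
                .build();
    }

    // Point the job repository at the embedded datasource explicitly.
    @Bean
    public BatchConfigurer batchConfigurer(
            @Qualifier("batchMetaDataSource") DataSource batchMetaDataSource) {
        return new DefaultBatchConfigurer(batchMetaDataSource);
    }
}
```

With this arrangement, the Hive tasklet would need to request its DataSource with @Qualifier("hiveJdbcBatchDataSource"), since an unqualified injection resolves to the embedded one. Job metadata kept in an in-memory database is lost on restart, so job restartability across JVM runs is not available.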
