I followed multiple tutorials to try to connect to Hive with RJDBC, without success.

Here is what I have:

library(DBI)
library(rJava)
library(RJDBC)


driver <- JDBC('org.apache.hive.jdbc.HiveDriver',
            classPath = list.files("/home/cdsw/R",pattern="jar$",full.names=T),
            identifier.quote="`")

USERNAME <- "MyUser"
PASSWORD <- "MySecretPassWord"
HOSTNAME <- "my.host.net"
PORT <- 10000

server <- sprintf('jdbc:hive2://%s:%s', HOSTNAME, PORT)

conn <- dbConnect(driver, server,
                  USERNAME, PASSWORD)

I have downloaded the jar files and placed them in "/home/cdsw/R/".

list.files("/home/cdsw/R",pattern="jar$",full.names=T)

[1] "/home/cdsw/R/hadoop-common-2.6.0-cdh5.16.99.jar"
[2] "/home/cdsw/R/hive-jdbc-1.1.0-cdh5.16.99.jar"

I've also tried more recent versions of the jars, always keeping them in sync with the same Cloudera version, even though my version is 5.XX.

I'm quite sure the HOSTNAME is correct, since I've made it work with impyla in Python using the same hostname and port.

The Error:

Error in .jcall(drv@jdrv, "Ljava/sql/Connection;", "connect", as.character(url)[1], : java.lang.NoClassDefFoundError: org/apache/thrift/TException

From what I understand, I don't have the correct .jars?
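One way to check this is to look inside each jar for the class named in the error. The following is only a rough sketch: `class_to_entry` and `jar_has_class` are hypothetical helper names, and it relies on a jar being a zip archive that `utils::unzip` can list.

```r
# Convert a Java class name to the path it would have inside a jar.
class_to_entry <- function(class_name) {
  paste0(gsub(".", "/", class_name, fixed = TRUE), ".class")
}

# TRUE if the jar (a zip archive) contains the given class.
jar_has_class <- function(jar, class_name) {
  class_to_entry(class_name) %in% utils::unzip(jar, list = TRUE)$Name
}

# Same jar directory as in the question.
jars <- list.files("/home/cdsw/R", pattern = "jar$", full.names = TRUE)

# One logical value per jar: does it provide the missing class?
vapply(jars, jar_has_class, logical(1),
       class_name = "org.apache.thrift.TException")
```

If every entry is FALSE, no jar on the classpath provides org.apache.thrift.TException, which would match the NoClassDefFoundError.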

Remark:

I cannot install hive-jdbc on the machine since I'm not root. Can I do without it, given that I have placed hive-jdbc-1.1.0-cdh5.16.99.jar in a folder?

Also, could Kerberos trigger this error?

BeGreen

1 Answer

I needed to download the standalone version of the Hive driver.

I used hive-jdbc-3.1.2-standalone.jar. The standalone version bundles the driver's dependencies, so it does not require a full install of the Hive client.
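For illustration, here is a minimal sketch of the connection code from the question, adapted to load only the standalone jar. It assumes the jar was saved in the same "/home/cdsw/R" folder; `find_standalone_jar`, `hive_url` and `connect_hive` are hypothetical helper names, not part of RJDBC. It does not cover Kerberos or any other authentication settings.

```r
# Pick up only the standalone jar, so the older non-standalone jars
# in the same folder are not put on the classpath alongside it.
find_standalone_jar <- function(dir) {
  list.files(dir, pattern = "standalone\\.jar$", full.names = TRUE)
}

# Build the HiveServer2 JDBC URL in the same way as the question does.
hive_url <- function(host, port) {
  sprintf("jdbc:hive2://%s:%s", host, port)
}

# Create the driver from the standalone jar and open the connection.
connect_hive <- function(dir, host, port, user, password) {
  jars <- find_standalone_jar(dir)
  if (length(jars) == 0) {
    stop("no standalone Hive JDBC jar found in ", dir)
  }
  driver <- RJDBC::JDBC("org.apache.hive.jdbc.HiveDriver",
                        classPath = jars,
                        identifier.quote = "`")
  DBI::dbConnect(driver, hive_url(host, port), user, password)
}
```

With the variables from the question, the call would be `conn <- connect_hive("/home/cdsw/R", HOSTNAME, PORT, USERNAME, PASSWORD)`.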

BeGreen