I followed multiple tutorials to try to connect to Hive with RJDBC, without success.
Here is what I have:
library(DBI)
library(rJava)
library(RJDBC)
driver <- JDBC('org.apache.hive.jdbc.HiveDriver',
               classPath = list.files("/home/cdsw/R", pattern = "jar$", full.names = TRUE),
               identifier.quote = "`")
USERNAME <- "MyUser"
PASSWORD <- "MySecretPassWord"
HOSTNAME <- "my.host.net"
PORT <- 10000
server <- sprintf('jdbc:hive2://%s:%s', HOSTNAME, PORT)
conn <- dbConnect(driver, server,
                  USERNAME, PASSWORD)
I have downloaded the jar files and placed them in "/home/cdsw/R/":
list.files("/home/cdsw/R",pattern="jar$",full.names=T)
[1] "/home/cdsw/R/hadoop-common-2.6.0-cdh5.16.99.jar"
[2] "/home/cdsw/R/hive-jdbc-1.1.0-cdh5.16.99.jar"
I've also tried more recent versions of the jars, always keeping them in sync with the same Cloudera version, even though my version is 5.XX.
I'm quite sure the HOSTNAME is correct, since I've made it work with impyla in Python with the same hostname/port.
The Error:
Error in .jcall(drv@jdrv, "Ljava/sql/Connection;", "connect", as.character(url)[1], : java.lang.NoClassDefFoundError: org/apache/thrift/TException
From what I understand, I don't have the correct .jar files?
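To narrow this down, here is a minimal sketch of how one could check whether any jar in the folder actually contains the class named in the error. It relies on a jar being a zip archive, so base R's `unzip(list = TRUE)` can list its entries. `find_class_in_jars` is a hypothetical helper name of my own, not part of RJDBC or rJava, and the check only looks at top-level jar entries (not jars nested inside other jars).

```r
# Hypothetical helper: return the jars in `jar_dir` that contain `class_name`.
find_class_in_jars <- function(jar_dir, class_name) {
  # "org.apache.thrift.TException" -> "org/apache/thrift/TException.class"
  entry <- paste0(gsub(".", "/", class_name, fixed = TRUE), ".class")
  jars <- list.files(jar_dir, pattern = "\\.jar$", full.names = TRUE)
  # A jar is a zip archive, so its entry names can be listed without extracting
  hits <- vapply(jars, function(jar) {
    entry %in% utils::unzip(jar, list = TRUE)$Name
  }, logical(1))
  jars[hits]
}

# Class taken from the NoClassDefFoundError above
find_class_in_jars("/home/cdsw/R", "org.apache.thrift.TException")
```

If this returns an empty character vector, none of the jars passed as classPath provides that class, which would be consistent with the error.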
Remark:
I cannot install hive-jdbc on the machine since I'm not root. Can I do without it, given that I have placed hive-jdbc-1.1.0-cdh5.16.99.jar in a folder?
Also, could Kerberos trigger this error?