
For reasons I have not been told, my Impala installation does not have a JDBC driver. This makes connecting from R to Impala challenging.

I am able to connect to the Impala shell via PuTTY and run queries, e.g.:

impala-shell --ssl -i some_name

Using the same connection mechanism and credentials as in PuTTY, can this be done from RStudio, with the SELECT results brought into a data frame?

Scott
  • Maybe, if the result is not too large, you can export it to csv by using `impala-shell --ssl -i some_name -q "query" --output_file --output_delimiter=` options? – mazaneicha Aug 18 '19 at 14:07
  • Can this command be run from R without JDBC driver? – Scott Aug 19 '19 at 20:23
  • You can probably execute it via remote shell, https://stackoverflow.com/questions/305035/how-to-use-ssh-to-run-a-shell-script-on-a-remote-machine. I'm no RStudio expert, sorry. – mazaneicha Aug 20 '19 at 11:39
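The remote-shell idea from the comments above can be sketched in R roughly as follows. This is only a sketch under several assumptions: an `ssh` client is on the PATH of the machine running R (with PuTTY, `plink` plays a similar role), authentication is non-interactive (for example key-based), and `impala-shell` on the remote host needs no password prompt. The helper names `build_impala_cmd` and `impala_query_ssh` and the `ssh_target` value are hypothetical. It relies on the `-B` (delimited output) and `--print_header` options of `impala-shell`, whose default delimiter is a tab.

```r
# Build the impala-shell command line that will run on the remote host.
# -B requests delimited (tab-separated) output; --print_header adds column names.
build_impala_cmd <- function(query, impalad = "some_name") {
  paste(
    "impala-shell --ssl -i", shQuote(impalad, type = "sh"),
    "-B --print_header",
    "-q", shQuote(query, type = "sh")
  )
}

# Run the query over ssh and parse the tab-separated output into a data frame.
# `ssh_target` (e.g. "user@edge-node") is a hypothetical placeholder.
impala_query_ssh <- function(query, ssh_target, impalad = "some_name") {
  remote_cmd <- build_impala_cmd(query, impalad)
  # stdout = TRUE captures the command's standard output as a character vector
  out <- system2("ssh", args = c(ssh_target, shQuote(remote_cmd)), stdout = TRUE)
  read.delim(text = out, stringsAsFactors = FALSE)
}

# Usage (needs a reachable host, so it is left commented out):
# df <- impala_query_ssh("SELECT * FROM mytable LIMIT 10", "user@edge-node")
```

This is probably only practical for modest result sets, since the whole output travels through the ssh session as text, and values that themselves contain tabs would break the parsing.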

2 Answers


This worked in my Oracle BDA cluster.

library(dsreq)

# Open a connection to Impala
print("Connecting to Impala...")
impaladb <- impalaConnection(pool = 'general')

# Run the query and store the result
dbResultsTempTbl <- dbGetQuery(impaladb, "SELECT * FROM mytable")

print("results")
print(dbResultsTempTbl)
KiranM

You can use the ODBC driver to connect to Impala:

library(odbc)
drv <- odbc::odbc()
con <- DBI::dbConnect(drv = drv, driver = "Cloudera ODBC Driver for Impala",
    host = "your hostname", port = 21050, Schema = "your schema")
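Once `con` is open, the SELECT results the question asks about can be pulled into a data frame with `DBI::dbGetQuery`. A minimal sketch, assuming the connection above succeeded; the helper name `query_to_df` is hypothetical and `mytable` is a placeholder table name:

```r
# Run a SELECT on an open DBI connection and return the rows as a data frame.
query_to_df <- function(con, sql) {
  stopifnot(is.character(sql), length(sql) == 1)
  DBI::dbGetQuery(con, sql)
}

# Usage, assuming `con` was opened as shown above and `mytable` exists:
# df <- query_to_df(con, "SELECT * FROM mytable LIMIT 100")
# str(df)
# DBI::dbDisconnect(con)  # close the connection when finished
```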
fc9.30