12

I have been trying to write an R script to query Impala database. Here is the query to the database:

select columnA, max(columnB) from databaseA.tableA where columnC in (select distinct(columnC) from databaseB.tableB ) group by columnA order by columnA 

When I run this query manually (read: outside the Rscript via impala-shell), I am able to get the table contents. However, when the same is tried via the R script, I get the following error:

[1] "HY000 140 [Cloudera][ImpalaODBC] (140) Unsupported query."       
[2] "[RODBC] ERROR: Could not SQLExecDirect 'select columnA, max(columnB) from databaseA.tableA where columnC in (select distinct(columnC) from databaseB.tableB ) group by columnA order by columnA'
closing unused RODBC handle 1

Why does the query fail when tried via R? and how do I fix this? Thanks in advance :)

Edit 1:

The connection script looks as below:

library("RODBC");
connection <- odbcConnect("Impala");
query <- "select columnA, max(columnB) from databaseA.tableA where columnC in (select distinct(columnC) from databaseB.tableB ) group by columnA order by columnA";
data <- sqlQuery(connection,query);
Gowtham Ganesh
  • 340
  • 1
  • 2
  • 12
  • 1
    How does your connection script look – R4nc1d May 14 '15 at 12:21
  • 1
    I have updated my question with the connection script. – Gowtham Ganesh May 14 '15 at 12:53
  • 2
    I'm having the exact same problem. My ODBC connection works fine with a simple query, so it's not an ODBC issue. Did you find any solution @GowthamGanesh? – SummerEla May 20 '15 at 04:18
  • Unfortunately no. I think it doesnt support complex queries like "group by" – Gowtham Ganesh Jul 21 '15 at 11:31
  • From what I've found through various forums, RODBC does not seem to offer support for complex queries. I ended up switching to python and impyla. Ibis for impala also looks promising.... – SummerEla Oct 06 '15 at 05:38
  • I was having the same problem. I just updated the drivers as R4nc1d suggested, and now I can do complex queries with "group by" and multiple joins and unions – lever Feb 04 '16 at 09:27
  • 1
    I also have the same issue from R to run a stored procedure in SQL. However, I observed that it does complete the stored procedure but it also throws error. I think it has something to do with time, if it crosses certain time, it throws this error.. – Espanta Aug 19 '16 at 07:13
  • I had a similar issue except my SQL statement was an `EXEC`. It turned out that the problem was nested INSERT-EXEC, almost as though RODBC had been inserting an extra layer of INSERTs, so my stored procedure worked fine outside R but fell apart in R. Did you ever end up solving your problem? – lebelinoz May 15 '17 at 06:37

6 Answers6

2

You need to install the relevant drivers, please look at the following link

I had the same issue, all i had to do was update the ODBC drivers.

Also if you can update your odbcConnect with the username and password

connection <- odbcConnect("Impala");

to

connection <- odbcConnect("Impala", uid="root", pwd="password")
R4nc1d
  • 2,923
  • 3
  • 24
  • 44
  • 3
    I did try installing the relevant drivers. However, nothing looks promising. I did try simple queries and they seem to go through fine and the issue comes when I try using complex queries involving joins and the like. – Gowtham Ganesh Jun 02 '15 at 07:07
  • Did you check your ODBC configuration in your OS? – SqueakyBeak Jan 03 '20 at 19:23
1

The RODBC package is quirky: if there's no row updated/deleted in the query execution it will throw an error.

So before using sqlDelete to delete rows, or using sqlUpdate to update values, first check if there's at least one row that will be deleted/updated by querying COUNT(*).

I've had no problem after implementing the check, for Oracle SQL 12g.


An alternative would be to use a staging table for the new batch of data, and use sqlQuery to execute a MERGE command. RODBC won't complaint if there's zero row merged.

Teng L
  • 297
  • 3
  • 17
  • can't believe this didn't get more upvotes. thanks man. could have been here all night without figuring it out. – Guy Manova Jun 10 '22 at 02:06
0

This might also be due to an error in your sql query itself. For example, I got this error when I missed an 'in' in the following generalized statement. Example:

stringstuff <- someDT$columnyouwanttouse

somestring <- toString(sprintf("'%s'", stringstuff)) 

RESULTS <- sqlQuery(con,  paste0("select

                        fling as flam

                        and toot **in** (",somestring,")

                        limit 30
                        ;"))

I got the error you did when I left out the 'in', so double check your syntax.

SqueakyBeak
  • 366
  • 4
  • 15
0

This error message can arise if the table doesn't exist in the database.

A few sensible checks:

  • Check for typos in the table name in your query
  • See if you can run the same query on the same database via another sql client
  • Talk to your data base administrator to confirm that the table does exist
stevec
  • 41,291
  • 27
  • 223
  • 311
0

Re-installing the RODBC package did the trick for me!

Daniel
  • 3
  • 3
0

I had a similar problem. After unnisntalling the R version 4.2.1 and install the R version 4.1.3 the problem was solved.