21

Issue: RODBC (falsely) returning zero rows

Situation:

I'm using RODBC to connect to a DSN I created using a commercial DB's ODBC driver (OSI Soft's PI Historian Time Series DB, if you're curious).

> library(RODBC)
> piconn <- odbcConnect("PIRV", uid = "pidemo")
> sqlStr <- "SELECT tag, time, status, value FROM piinterp WHERE tag = 'PW1.PLANT1.PRODUCTION_RATE' and time > DATE('-4h') and timestep = '+2m'"

Now if I query, I get zero rows.

> sqlQuery(piconn, sqlStr)
[1] TAG    TIME   STATUS VALUE 
<0 rows> (or 0-length row.names)

With BelieveNRows = FALSE these all still show zero results, even though it should return 120 rows.

> sqlQuery(piconn, sqlStr, believeNRows = FALSE)
> sqlQuery(piconn, sqlStr, believeNRows = FALSE, max = 0)
> sqlQuery(piconn, sqlStr, believeNRows = FALSE, max = 0, buffsize = 120)

What else can I try?


Proof that there should be many rows:

In Excel or Command Prompt

SELECT tag, time, status, value FROM piinterp WHERE tag = 'PW1.PLANT1.PRODUCTION_RATE' and time > DATE('-4h') and timestep = '+2m'

With results...

TAG                         TIME            STATUS  VALUE
PW1.PLANT1.PRODUCTION_RATE  15/09/2011 9:33 448 0
PW1.PLANT1.PRODUCTION_RATE  15/09/2011 9:31 452 0
PW1.PLANT1.PRODUCTION_RATE  15/09/2011 9:29 390 0
PW1.PLANT1.PRODUCTION_RATE  15/09/2011 9:27 419 0
PW1.PLANT1.PRODUCTION_RATE  15/09/2011 9:25 413 0
PW1.PLANT1.PRODUCTION_RATE  15/09/2011 9:23 393 0
PW1.PLANT1.PRODUCTION_RATE  15/09/2011 9:21 427 0
etc

Both in R and in Excel, if I query for a tag that doesn't exist, say tag = 'aeeEEEEE11!!!', it correctly returns zero rows.


Additional Info

SQL Tables

> sqlTables(piconn)
   TABLE_QUALIFIER TABLE_OWNER TABLE_NAME TABLE_TYPE  REMARKS
1             <NA>        <NA>    pialias      TABLE  pialias
2             <NA>        <NA>      piavg      TABLE    piavg
3             <NA>        <NA>    pibatch      TABLE  pibatch
4             <NA>        <NA>     picomp      TABLE   picomp
5             <NA>        <NA>   piinterp      TABLE piinterp

ODBC info

> odbcGetInfo(piconn)
       DBMS_Name         DBMS_Ver  Driver_ODBC_Ver Data_Source_Name      Driver_Name       Driver_Ver         ODBC_Ver      Server_Name 
            "PI"     "03.04.0370"          "02.01"           "PIRV"   "PIODBC32.DLL"     "01.03.0100"     "03.52.0000"     "Aurvyzpis1" 

My session info :

 sessionInfo()
R version 2.12.2 (2011-02-25)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252    LC_MONETARY=English_Australia.1252 LC_NUMERIC=C                      
[5] LC_TIME=English_Australia.1252    

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ggplot2_0.8.9 proto_0.3-9.2 reshape_0.8.4 plyr_1.6      RODBC_1.3-3  

loaded via a namespace (and not attached):
[1] tools_2.12.2
Tommy O'Dell
  • 7,019
  • 13
  • 56
  • 69
  • 1
    Are any data returned if you omit the `WHERE` clause entirely? – Joshua Ulrich Sep 15 '11 at 02:13
  • I'd give you 10 up votes for a concise, descriptive and well thought out problem. I'm sure you'd rather have an answer but I can't help you there. Safe to assume there is no difference in running the table info commands for piinterp? Can you verify that a RODBC select statement does return rows (remove your where clause) to help isolate whether it's an issue with the filter in your query vs a generic issue getting data out? – billinkc Sep 15 '11 at 02:14
  • @Joshua, `HY000 -268 [OSI][PI ODBC][PI]Syntax error at offset 8 token -> [2] "[RODBC] ERROR: Could not SQLExecDirect 'SELECT tag, time, status, value FROM piinterp'` – Tommy O'Dell Sep 15 '11 at 03:17
  • @billinkc. I don't know if it's possible to return rows w/o a WHERE clause for this RDMBS. But either way, I'm positive that there should be rows returned for this query, as there are from the cmd prompt as well as in Excel. FYI - is my question that unclear? I thought it was alright, but I'll edit to see what I can do. – Tommy O'Dell Sep 15 '11 at 03:40
  • I would try the following. Can you run the SQL query directly on the database and not through ODBC? Can you run a trivial query through ODBC (like `SELECT 0 as n;`)? Can you connect to the database using ODBC from some other environment like VBA or MS Access? – adamleerich Sep 15 '11 at 04:16
  • As noted by Btibert3 on the following question http://stackoverflow.com/questions/11954734/rodbc-query-not-returning-data ODBC connections can get confused by date/timestamp formats and falsely report 0 rows. I've just seen this with RODBC and an Oracle database. Not retrieving the date/timestamp column corrected the problem for me. Hope this helps someone. – makeyourownmaker Mar 17 '17 at 12:53

5 Answers5

13

It turns out that all I needed to do was to set rows_at_time = 1 in addition to believeNRows = FALSE while setting up my ODBC connection.

piconn <- odbcConnect(dsn = "PI", uid = "pwd", believeNRows = FALSE, rows_at_time = 1)
sqlStr <- "SELECT tag, time, status, value FROM piinterp WHERE tag = 'RV1.MADST101_WINDSPEED' and time > DATE('-12h') and timestep = '+2m'"    
results <- sqlQuery(piconn, sqlStr)
Tommy O'Dell
  • 7,019
  • 13
  • 56
  • 69
3

Try adding

believeNRows = FALSE

to the query. This is an issue that has come up with a few of the drivers which report a wrong size on the result set.

Dirk Eddelbuettel
  • 360,940
  • 56
  • 644
  • 725
  • Hey Dirk, unfortunately `sqlQuery(piconn, sqlStr, believeNRows = FALSE)` still returns zero rows. As does `sqlQuery(piconn, sqlStr, believeNRows = FALSE, max = 0)` that I've seen others have success with (though, for Oracle). – Tommy O'Dell Sep 15 '11 at 03:23
  • It turns out that I needed to add `believeNRows = FALSE` and also `rows_at_time = 1`. Thanks Dirk. – Tommy O'Dell Jul 09 '12 at 00:19
3
  1. Test if the ODBC driver works correctly. Each ODBC driver should provide simple means to test the connection. Also try to connect to the ODBC source using MS Office (Access, Excel...) or Open Office.
  2. If the above works, then go to R and try the simplest query possible, like select 1 (your query isn't the simplest, as darcken noted! You have to try really the simplest query to be sure).
  3. If it doesn't work, try to call odbcGetErrMsg() function after each RODBC function call (after connect, after query, ...).
Tomas
  • 57,621
  • 49
  • 238
  • 373
1

I had the same problem and fixed it by adding "rows_at_time=1" to the odbcConnect call. From the odbcConnect help:

*Several errors which have been reported as bugs in RODBC 1.3-0 which were in fact ODBC driver errors that can be circumvented by setting rows_at_time = 1 (and the warning under that argument has always been there). The drivers involved have been third-party Oracle drivers and old SQL Server drivers.*

In my case I was running 64bit R 2.15.0, RODBC 1.3-5 and the Actual ODBC Oracle driver on OS X Lion.

  • Thanks Klaas! It turns out that you and Dirk were BOTH right. The combination of `rows_at_time = 1` and `believeNRows = FALSE` is what did the trick here. – Tommy O'Dell Jul 09 '12 at 00:20
0

I think you need to rule out that your actually connecting to the database/table first by getting SELECT * FROM MYTABLE to work within R. If you can't get this working then something is wrong with your setup/drivers.

Once you are sure that you can actually query the database/table within R then progressively make your query more complex and try to isolate where the issue is. One thing to try might be double == on your equality conditions.

I've tried using a bunch of databases(sql server,mysql,sqlite) within R and the performance has been poor with all of them. Imo your better off querying the database natively, dumping to text, then reading the file into R.

darckeen
  • 960
  • 2
  • 12
  • 20
  • The simplest query possible looks like `"SELECT * FROM piinterp WHERE time > DATE('-1h')"`. And this returns zero rows. You simply can't query this DB without `time > DATE()`. I've tried many other equally valid query variations which worked in Excel through the ODBC but return zero rows in R. I'm 100% sure all valid queries are returning zero rows in R. – Tommy O'Dell Sep 17 '11 at 09:16
  • As I said in my comment above, try a simple query like `SELECT 0 as n;`. It doesn't get any simpler. If that doesn't work then you know it is something with the driver. If it does, then there is something wrong with your query. – adamleerich Sep 18 '11 at 00:21
  • As I said in my comment above, a query like `SELECT 0 as n;` is not possible in this database. You simply have to specify a table name and date range in every query for this time-series database. – Tommy O'Dell Apr 11 '13 at 03:41