0

I have connected to an oracle database using R, I can perform queries on it but I'm having a problem with performing queries on a date column. I want to get the rows where a certain date column (INLDATE) is ranged between today and 7 days from now.

Other queries I performed with the dplyr package, like this;

tbl(con, 'UNIT') %>% select(SEQ) %>% filter(SEQ > 203)

This works perfectly fine, however when I try something like

Today <- Sys.Date()

tbl(con, 'UNIT') %>% select(INLDATE) %>% filter(INLDATE > Today)

or even

tbl(con, 'UNIT') %>% select(INLDATE) %>% filter(INLDATE > 2021-04-04)

it gives me errors that it expects a DATE but it's being interpreted as a number or a string

What is the correct way to do this? Thanks in advance!

enter image description here

BillyBouw
  • 314
  • 2
  • 10

1 Answers1

1

(This tests well with SQL Server, I don't have Oracle available, but the premise is the same.)

Your "date" resolves to an arithmetic expression:

2021-04-04
# [1] 2013

You can see how this is resolving in the dbplyr pipeline by using show_query():

tbl(con, "UNIT") %>%
  filter(INLDATE > 2021-05-07) %>%
  show_query()
# <SQL>
# SELECT *
# FROM "UNIT"
# WHERE ("INLDATE" > 2021.0 - 5.0 - 7.0)

What you need is either "2021-05-06" or as.Date("2021-05-06").

tbl(con, "UNIT") %>%
  filter(INLDATE > "2021-05-07") %>%
  show_query()
# <SQL>
# SELECT *
# FROM "UNIT"
# WHERE ("INLDATE" > '2021-05-07')

tbl(con, "UNIT") %>%
  filter(INLDATE > as.Date("2021-05-07")) %>%
  show_query()
# <SQL>
# SELECT *
# FROM "UNIT"
# WHERE ("INLDATE" > TRY_CAST('2021-05-07' AS DATE))

And to do "ranged between today and 7 days from now", you can use between:

tbl(con, "UNIT") %>%
  filter(between(INLDATE, "2021-04-04", "2021-04-11")) %>%
  show_query()
<SQL>
SELECT *
FROM "UNIT"
WHERE ("INLDATE" BETWEEN '2021-04-04' AND '2021-04-11')
r2evans
  • 141,215
  • 6
  • 77
  • 149
  • Thank you for the answer, however when I execute your query I get the following error; Error: nanodbc/nanodbc.cpp:1655: HY000: [Oracle][ODBC][Ora]ORA-01861: literal does not match format string 'SELECT * FROM ("UNIT") WHERE ("INLDATE" > '2021-05-07') FETCH FIRST 11 ROWS ONLY' – BillyBouw May 07 '21 at 13:37
  • Yeah, I have no idea what your schema looks like, so it's difficult to know for sure. Perhaps these can help? https://stackoverflow.com/q/22542882/3358272 or https://stackoverflow.com/q/1387917/3358272 (all I did was google [`oracle "literal does not match format string"`](https://www.google.com/search?q=oracle+"literal+does+not+match+format+string"), I don't know Oracle at all. Can you post the output from `tbl(con, "UNIT") %>% select(INLDATE) %>% head(1) %>% collect() %>% dput()`? – r2evans May 07 '21 at 13:49
  • tbl(con, "UNIT") %>% select(DELDATE) %>% head(1) %>% collect() %>% dput() structure(list(DELDATE = structure(1108425600, class = c("POSIXct", "POSIXt"), tzone = "UTC")), row.names = c(NA, -1L), class = c("tbl_df", "tbl", "data.frame")) – BillyBouw May 07 '21 at 14:03
  • It's possible that Oracle needs a full timestamp, not just a date (I really don't know). Try using `"2021-04-11 00:00:00"` or similar. – r2evans May 07 '21 at 14:04
  • It does seem like the dates are a date time () when I print it in the console, when I write to a table in csv however its a regular date. There is a separate column for the time so it shouldn't be like that, but however, I tried your suggestion but still no result :( – BillyBouw May 07 '21 at 14:13
  • I don't know what more I can actually do. Some thoughts: (1) Find out what Oracle thinks `UNIT.INLDATE` is, what class specifically, it might help inform follow-on action. (2) Most DBMSs provide the ability to cast a column in the conditional, perhaps `TRY_DATE('2021-04-04')`, find how to do that. (3) Do a raw SQL query (using `DBI::dbGetQuery(con, "select ...")` and see if you can get it to work that way. Once you figure out the SQL necessary, back-track that to see how to get `dbplyr` to speak that specific dialect (or just override it with `sql`. (`DBI` is my strength, not `dbplyr`, sorry.) – r2evans May 07 '21 at 14:19
  • Thank you very much, but I got the answer somewhere, the magic trick was tbl(con, "UNIT") %>% select(DELDATE) %>% filter(DELDATE >= TO_DATE('2020-12-09','YYYY-MM-DD')) – BillyBouw May 07 '21 at 14:35
  • Yeah, that was included in the two links I mentioned in the second comment. I'm glad they worked for you! – r2evans May 07 '21 at 14:45