5

I have a data frame that has a "DATE" field. e.g.: "24-10-2015"

The variable is in the date format.

When I use sqldf, e.g.: select min(DATE), MAX (DATE) from table ... the output is a number like 16623.

Tried FORMAT and CONVERT but they don't work in sqldf.

Any hints?

zx8754
  • 52,746
  • 12
  • 114
  • 209
saxo
  • 81
  • 1
  • 2
  • 7
  • 1
    SO creates long-term usable information for current & future readers. Questions should be self-contained with the minimal code & data needed so readers can copy it from the question & paste it into a running R session & see what the poster sees. The data should be part of the question & not elsewhere to ensure that it continues to be vailable. Use the R command `dput(X)` to provide X in reproducible form. See (1) http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example (2) http://stackoverflow.com/help/mcve (3) http://stackoverflow.com/help/how-to-ask – G. Grothendieck Nov 05 '15 at 15:17

4 Answers4

8

Specify the methods for each column in the data frame. Assume 'data' is the name of the data frame with the column name 'd' containing the 'Date' format.

Try the following:

sqldf('select max(d) as MAX__Date,
              min(d) as MIN__DATE
       from data',
      method = "name__class")

This should work.

zx8754
  • 52,746
  • 12
  • 114
  • 209
Murugesan
  • 131
  • 4
  • 1
    This was my preferable method, since no changes are made on the data of the data frame. And IT WORKS! – saxo Nov 05 '15 at 16:24
4

I suggest you transform your date to POSIXct using as.POSIXct so that you can work with the date function of SQLite:

Using some random data:

#notice I keep the class of the date as POSIXct
#it really does not change anything
df <- data.frame(date = as.POSIXct('2015-01-01'))

#> df
#        date
#1 2015-01-01

And then you can do:

#using the date function in SQLite you convert the nanoseconds
#produced by min(date) back to a date.
sqldf('select date(min(date), "unixepoch", "localtime") from df')
  date(min(date), "unixepoch", "localtime")
1                                2015-01-01

And you have what you need. There is more info about how SQLite understands dates here

LyzandeR
  • 37,047
  • 12
  • 77
  • 87
  • An other solution is to use `as.Date(16623)` which will return "2015-07-07" – etienne Nov 05 '15 at 15:31
  • Thanks @etienne . How do you mean? – LyzandeR Nov 05 '15 at 15:33
  • When saxo runs his code the output are numbers so after his code he can convert the numbers into dates using `as.Date(number)` which will return a Date. – etienne Nov 05 '15 at 15:36
  • @etienne I see. So, you mean after running the `sqldf` function. This is a possibility for sure. You still need to provide the `origin` argument in order to work but you are right. You may want to write it as an answer. Thanks :) – LyzandeR Nov 05 '15 at 15:38
3

An other possibility is to not change your sqldf function and then to convert your dates stored as numbers. You can use as.Date() for this :

zoo::as.Date(16623)
[1] "2015-07-07"

As LyzandeR mentioned, you should specify an origin which states what the first date is. If you are using the zoo package, the default is "1970-01-01" and for your format it is probably the correct origin, but if you don't use it (meaning you sue the function from the base package then you must specify it.

as.Date(16623, origin = "1970-01-01")
[1] "2015-07-07"

But if you had dates from Excel you should change the origin :

zoo::as.Date(42313)
[1] "2085-11-06"

as.Date(42313, origin = "1899-12-30") # for Windows, use "1904-01-01" for Mac
[1] "2015-11-05" # correct result

I actually found why not supplying origin was working for me : I had the package zoo loaded, in which "1970-01-01" is the default option for origin:

base::as.Date(16623)
Error in as.Date.numeric(16623) : 'origin' must be supplied

zoo::as.Date(16623)
[1] "2015-07-07"

Here are the codes where you can see that zoo specifies a default origin for the function as.Date.numeric which is not the case for the base package :

base::as.Date.numeric
function (x, origin, ...) 
{
    if (missing(origin)) 
        stop("'origin' must be supplied")
    as.Date(origin, ...) + x
}
<bytecode: 0x17190e78>
<environment: namespace:base>

zoo::as.Date.numeric
function (x, origin, ...) 
{
    if (missing(origin)) 
        origin <- "1970-01-01"
    if (identical(origin, "0000-00-00")) 
        origin <- as.Date("0000-01-01", ...) - 1
    as.Date(origin, ...) + x
}
<environment: namespace:zoo>
etienne
  • 3,648
  • 4
  • 23
  • 37
  • Good one! On my computer `as.Date(16623)` returns an error: `'origin' must be supplied` . This is probably a < R 3.2 error. – LyzandeR Nov 05 '15 at 15:50
  • 1
    @LyzandeR: I don't know why it works on my computer as I have R 3.0.2 But in the help it is written "as.Date will accept numeric data (the number of days since an epoch), but only if origin is supplied." so the best practice is to include the origin – etienne Nov 05 '15 at 15:57
0

If you only need to check specific date values and not date ranges, you could convert your date column into a string using as.character before you call your sql code, like so:

table$date2 = as.character(table$date)
sqldf("select max(date2), min(date2) from table")

This worked really well for me and I haven't run into any drawbacks yet.

from: https://ladvien.com/sqldf-dates/

krock
  • 483
  • 4
  • 11