d$Accessed.Time<-strptime(d$accessed_at,format="%Y-%m-%d %H:%M:%S")
d$Counselor.Added.Time<-strptime(d$counselor_added_at,format="%Y-%m-%d %H:%M:%S")
d$logtime<-as.numeric(d$Accessed.Time-d$Counselor.Added.Time,units="days")
View(d[which(is.na(d$logtime)),
c("accessed_at","Accessed.Time","counselor_added_at","Counselor.Added.Time","logtime")])

First I converted d$accessed_at and d$counselor_added_at to R date-time variables, subtracted one from the other, and stored the result in d$logtime. The weird thing is that R treats certain d$Counselor.Added.Time values as NA even though they appear to have been converted successfully.

[Screenshot: R treats my datetime variable as NA even though it's converted successfully]

The screenshot above shows the output of the last View statement in R.

is.na() returns TRUE for Counselor.Added.Time in all of these observations, and arithmetic on them fails, even though the values appear to have been converted successfully.

Does anyone know what's going on?

It appears that this error is specific to these particular times.

I tried this: a<-strptime("2015-03-08 02:33:07",format="%Y-%m-%d %H:%M:%S") and is.na(a) returned TRUE.

Brian Tompsett - 汤莱恩
Harry
  • Please provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – Heroka Dec 16 '15 at 19:44
  • It appears to be specific to those specific times. – Harry Dec 16 '15 at 19:51
  • Try this: ```a<-strptime("2015-03-08 02:33:07",format="%Y-%m-%d %H:%M:%S")``` ```is.na(a)``` – Harry Dec 16 '15 at 19:52
  • I don't know how that's supposed to help me. Can you post some of your data (both with and without the issue), preferably as the output of dput? – Heroka Dec 16 '15 at 19:55
  • Due to daylight savings time, "2015-03-08 02:33:07" didn't exist in many timezones. If you specify `tz = "GMT"` in your `strptime` the time then is not missing. – Gregor Thomas Dec 16 '15 at 20:22
  • Thank you so much Gregor! That's exactly the issue! – Harry Dec 16 '15 at 21:12

1 Answer

You can get confusing behaviour around the changes to and from daylight saving time.
For example, in Melbourne, Australia, the time 2:30 am on 7 October 2012 doesn’t exist because clocks were moved forward one hour, from 2 am to 3 am. R returns NA if we attempt to use that time.

ISOdatetime(2012,10,7,2,30,0, tz='Australia/Melbourne')

[1] NA

The behaviour of strptime is interesting: the conversion is done and the printed value looks fine, but it is actually missing.

x <- strptime('2012-10-7 2:30:0',format="%Y-%m-%d %H:%M:%S", tz='Australia/Melbourne')
x
#[1] "2012-10-07 02:30:00"
is.na(x)
#[1] TRUE
as.numeric(x)
# NA

Let's try a time that does exist:

x <- strptime('2012-10-7 3:30:0',format="%Y-%m-%d %H:%M:%S", tz='Australia/Melbourne')
x
#[1] "2012-10-07 03:30:00 AEDT"
is.na(x)
# [1] FALSE
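To find which values in a column fall into such a gap, one option is to compare parsing in the zone of interest against parsing in UTC. The sketch below uses a hypothetical helper, `find_dst_gaps` (not a base R function), and assumes the strings are otherwise well formed. How R treats non-existent times can vary by platform and R version, so the result may differ from machine to machine.

```r
# Hypothetical helper: return the strings that parse in UTC
# but come back as NA when interpreted in the zone `tz`.
find_dst_gaps <- function(x, tz, fmt = "%Y-%m-%d %H:%M:%S") {
  in_zone <- strptime(x, format = fmt, tz = tz)
  in_utc  <- strptime(x, format = fmt, tz = "UTC")
  # Malformed strings are NA in both zones, so they are not reported here.
  x[is.na(in_zone) & !is.na(in_utc)]
}

times <- c("2012-10-07 02:30:00", "2012-10-07 03:30:00", "not a date")
gaps <- find_dst_gaps(times, tz = "Australia/Melbourne")
gaps
```

Comparing against UTC separates values that are unparseable in any zone from values that are only missing because of a daylight saving transition.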

These problems go away if you specify the time zone as UTC, which has no daylight saving transitions:

x <- strptime('2012-10-7 2:30:0',format="%Y-%m-%d %H:%M:%S", tz='UTC')
x
#[1] "2012-10-07 02:30:00 UTC"

The other tricky thing is that, by default, the time zone is taken from whatever time zone your computer is using. Code might work or not work depending on whether you run it during summer (daylight saving time) or winter. It is safest to always specify the time zone rather than rely on a default value.
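Applied to the data in the question, a minimal sketch might look like the following. The data frame here is made up for illustration and only reuses the question's column names. It also assumes that treating the timestamps as UTC is acceptable; if they were recorded in a local zone, an elapsed time that spans a daylight saving transition can be off by an hour.

```r
# Hypothetical sample data using the column names from the question.
d <- data.frame(
  counselor_added_at = c("2015-03-08 02:33:07", "2015-03-07 10:00:00"),
  accessed_at        = c("2015-03-09 02:33:07", "2015-03-07 22:00:00"),
  stringsAsFactors   = FALSE
)

fmt <- "%Y-%m-%d %H:%M:%S"

# An explicit tz avoids depending on the machine's local time zone.
# UTC has no daylight saving transitions, so no wall-clock time is skipped.
d$Accessed.Time        <- as.POSIXct(d$accessed_at, format = fmt, tz = "UTC")
d$Counselor.Added.Time <- as.POSIXct(d$counselor_added_at, format = fmt, tz = "UTC")

# difftime() with explicit units gives the elapsed time in days.
d$logtime <- as.numeric(
  difftime(d$Accessed.Time, d$Counselor.Added.Time, units = "days")
)

anyNA(d$logtime)
```

as.POSIXct is used here instead of strptime because it returns a compact POSIXct vector, which is generally easier to store as a data frame column than the list-based POSIXlt that strptime produces.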

Tony Ladson