
I am trying to convert a character string into a POSIXct date format and running into a problem with the time zone information.

The original character data looks like this:

SD$BGN_DTTM
[1] "1956-05-25 14:30:00 CST" "1956-06-05 16:30:00 CST" "1956-07-04 15:30:00 CST"
[4] "1956-07-08 08:00:00 CST" "1956-08-19 12:00:00 CST" "1956-12-23 00:50:00 CST"

but when I attempt to convert using as.POSIXct, this happens:

SD$BGN_DTTM <- as.POSIXct(SD$BGN_DTTM)
[1] "1956-05-25 14:30:00 PDT" "1956-06-05 16:30:00 PDT" "1956-07-04 15:30:00 PDT"
[4] "1956-07-08 08:00:00 PDT" "1956-08-19 12:00:00 PDT" "1956-12-23 00:50:00 PST"

It looks like the function isn't reading the time zone I've specified. Since my computer is on PDT, it looks like it has used that instead. Note also that it has appended PST to the last date (seems odd). Can anyone tell me what is going on here, and whether there is a method to get R to read the time zone information as shown?

Jen H
  • A small comment. CST, as an ISO timezone, stands for China Standard Time. People in US would just treat CST as central standard time -- yet this abbreviation is not standard. So if R recognizes timezone in this case, it would still give you surprising results. – Gang Liang May 27 '19 at 08:12

2 Answers


This would still have the problem you noticed with daylight/standard times (here test holds the character vector from the question):

> strptime(test, format="%Y-%m-%d %H:%M:%S", tz="America/Chicago")
[1] "1956-05-25 14:30:00 CDT" "1956-06-05 16:30:00 CDT"
[3] "1956-07-04 15:30:00 CDT" "1956-07-08 08:00:00 CDT"
[5] "1956-08-19 12:00:00 CDT" "1956-12-23 00:50:00 CST"

The strptime function refuses to honor the "%Z" format for input (which, in its defense, is documented). Many people have lost great gobs of hair, and probably thrown some keyboards into monitors, in efforts to get R timezones working to their (dis?)satisfaction.
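Since "%Z" is not honored on input, one workaround is to strip the abbreviation yourself and map it to a fixed-offset zone before parsing. The following is only a minimal sketch: the lookup table tz_lookup and the helper parse_with_abbrev are hypothetical names, the table assumes the US reading of each abbreviation, and it assumes the "Etc/GMT+N" zones are available in your time zone database.

```r
# Hypothetical lookup table: US standard-time abbreviations mapped to
# fixed-offset zone names. Note the POSIX sign convention: "Etc/GMT+6"
# means 6 hours *behind* UTC.
tz_lookup <- c(EST = "Etc/GMT+5", CST = "Etc/GMT+6",
               MST = "Etc/GMT+7", PST = "Etc/GMT+8")

# Hypothetical helper: parse "YYYY-MM-DD HH:MM:SS ABBR" strings into one
# POSIXct vector stored and displayed in UTC.
parse_with_abbrev <- function(x, lookup = tz_lookup) {
  abbrev <- sub("^.* ", "", x)         # text after the last space
  stamp  <- sub(" [A-Za-z]+$", "", x)  # date-time without the abbreviation
  zone   <- lookup[abbrev]
  if (anyNA(zone)) {
    stop("unknown abbreviation(s): ",
         paste(unique(abbrev[is.na(zone)]), collapse = ", "))
  }
  # Parse each element in its own zone and keep seconds since the epoch.
  secs <- mapply(function(s, z) {
    as.numeric(as.POSIXct(s, tz = z, format = "%Y-%m-%d %H:%M:%S"))
  }, stamp, zone)
  as.POSIXct(unname(secs), origin = "1970-01-01", tz = "UTC")
}

x <- c("1956-05-25 14:30:00 CST", "1956-12-23 00:50:00 PST")
parsed <- parse_with_abbrev(x)
format(parsed, "%Y-%m-%d %H:%M:%S")
# "1956-05-25 20:30:00" "1956-12-23 08:50:00"  (UTC)
```

Parsing element by element is slow for large vectors; splitting by zone and parsing each group in one call would be a reasonable refinement.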

IRTFM

As we all know, time is a relative thing. Storing time as UTC/GMT or relative to UTC/GMT will make sure that daylight savings etc only come into play when you want them to, as per: Does UTC observe daylight saving time?

So, if:

x <- c("1956-05-25 14:30:00 CST","1956-06-05 16:30:00 CST", "1956-07-04 15:30:00 CST",
"1956-07-08 08:00:00 CST", "1956-08-19 12:00:00 CST","1956-12-23 00:50:00 CST")

You can find out that CST is 6 hours behind UTC/GMT (as opposed to CDT, which is daylight saving time and is 5 hours behind).
Therefore:

out <- as.POSIXct(x, tz="Etc/GMT+6")

will represent CST without any daylight saving shift to CDT (the "+6" follows the POSIX sign convention and means 6 hours behind UTC). That way, if you later convert to the local Central time zone, the proper CST time is returned without the actual data being changed for daylight saving. When R prints CDT, it is only shifting the display of the time forward an hour; the underlying numerical data is not changed. The last case displays as expected, because standard time has kicked back in:

attr(out,"tzone") <- "America/Chicago"
out
#[1] "1956-05-25 15:30:00 CDT" "1956-06-05 17:30:00 CDT" "1956-07-04 16:30:00 CDT"
#[4] "1956-07-08 09:00:00 CDT" "1956-08-19 13:00:00 CDT" "1956-12-23 00:50:00 CST"

That is, for case 1, 15:30 CDT == 14:30 CST, as originally specified; and once daylight saving ends, for case 6, 00:50 CST == 00:50 CST, as originally specified.

Comparing this final out to the other answer, you can see there is an actual numerical time difference of one hour for all the daylight savings cases:

out - strptime(x, format="%Y-%m-%d %H:%M:%S", tz="America/Chicago")
#Time differences in secs
#[1] 3600 3600 3600 3600 3600    0
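The same point can be checked on a single timestamp. This is a small sketch using the first value from the question; the variable names fixed, chicago and shown are only illustrative, and it assumes the "Etc/GMT+6" zone is available on your system.

```r
x1 <- "1956-05-25 14:30:00"

# 14:30 at a fixed UTC-6 offset (CST, no daylight saving rules applied)
fixed <- as.POSIXct(x1, tz = "Etc/GMT+6")

# 14:30 on the Chicago wall clock, which was on CDT on that date
chicago <- as.POSIXct(x1, tz = "America/Chicago")

# Changing the tzone attribute changes only how the instant is displayed
shown <- fixed
attr(shown, "tzone") <- "America/Chicago"

as.numeric(shown) == as.numeric(fixed)    # TRUE: same underlying instant
format(shown, "%H:%M")                    # "15:30"
as.numeric(fixed) - as.numeric(chicago)   # 3600 seconds apart
```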
thelatemail
  • That looked as though it might be an answer, but when I dug in it seemed less satisfying. You actually changed the question to have all the input times in CST rather than dealing with the timezone shift. In the summer months Chicago is 7 hours different than GMT. – IRTFM Mar 20 '15 at 05:18
  • @BondedDust - I didn't change anything. That's the original character data from OP. R deals with the timezone shifts when changing from GMT to Chicago time - the underlying data stays the same. As opposed to the result of your answer where 2:30 CST changes to 2:30 CDT, which is actually not the same point-in-time. – thelatemail Mar 20 '15 at 05:34
  • I guess it's a small comfort to know better minds have struggled with the issue. @thelatemail - your explanation and example are very useful. The raw dataset actually contains many time zones - these were date/times of events recorded on a local clock. I think it is best to create a new column, load in the date/time, and then go back and apply the appropriate timezones as you have shown - so essentially I have them all in UTC. Thanks to both of you! – Jen H Mar 21 '15 at 22:17
  • @42: Why 7 hours behind GMT? Wikipedia states that CST = UTC-06:00 (in winters), while CDT = UTC-05:00 (in summers), both representing CT zone (where Chicago is): https://en.wikipedia.org/wiki/Time_in_the_United_States#Standard_time_and_daylight_saving_time – Oleg Melnikov Dec 14 '15 at 05:01