Here's some of my data, read in from a file named AttReport_all:

Registration.Date                 Join.Time                Leave.Time
1 Jul 05, 2011 09:30 PM EDT Jul 07, 2011 01:05 PM EDT Jul 07, 2011 01:53 PM EDT
2 Jul 05, 2011 10:20 AM EDT Jul 07, 2011 01:04 PM EDT Jul 07, 2011 01:53 PM EDT
3 Jul 04, 2011 02:41 PM EDT Jul 07, 2011 12:49 PM EDT Jul 07, 2011 01:53 PM EDT
4 Jul 04, 2011 11:38 PM EDT Jul 07, 2011 12:49 PM EDT Jul 07, 2011 01:54 PM EDT
5 Jul 05, 2011 11:41 AM EDT Jul 07, 2011 12:54 PM EDT Jul 07, 2011 01:54 PM EDT
6 Jul 07, 2011 11:08 AM EDT Jul 07, 2011 01:16 PM EDT Jul 07, 2011 01:53 PM EDT

If I do strptime(AttReport_all$Registration.Date, "%b %m, %Y %H:%M %p", tz="") I get an array of NAs where I'm expecting dates.

Sys.setlocale("LC_TIME", "C") returns "C"

typeof(AttReport_all$Registration.Date) returns "integer"

is.factor(AttReport_all$Registration.Date) returns TRUE.

What am I missing?

Here's version output, if it helps:

platform i386-pc-mingw32
arch i386
os mingw32
system i386, mingw32
status
major 2
minor 13.0
year 2011
month 04
day 13
svn rev 55427
language R
version.string R version 2.13.0 (2011-04-13)

William Gunn

1 Answer

strptime automatically runs as.character on the first argument (so it doesn't matter that it's a factor) and any trailing characters not specified in format= are ignored (so "EDT" doesn't matter).

There are two problems with the format string: the typo @Ben Bolker identified (%m should be %d), and %H, which should be %I (?strptime says you should not use %H with %p).

# %b and %m are both *month* formats
strptime("Jul 05, 2011 09:30 PM EDT", "%b %m, %Y %H:%M %p", tz="")
# [1] NA

# change %m to %d and we no longer get NA, but the time is wrong (AM, not PM)
strptime("Jul 05, 2011 09:30 PM EDT", "%b %d, %Y %H:%M %p", tz="")
# [1] "2011-07-05 09:30:00"

# use %I (not %H) with %p
strptime("Jul 05, 2011 09:30 PM EDT", "%b %d, %Y %I:%M %p", tz="")
# [1] "2011-07-05 21:30:00"
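
To tie this back to the question, here is a minimal sketch of the corrected format applied to a whole factor column. The data frame `att` is a hypothetical stand-in for AttReport_all, built from the first two rows shown in the question, and tz = "UTC" is an arbitrary assumption made only so the result does not depend on the machine's time zone. Note that the "EDT" in the strings is not interpreted; the time zone of the result is whatever is passed as tz.

```r
# Month abbreviations (%b) and AM/PM (%p) are locale-dependent,
# so use the "C" locale as in the question
invisible(Sys.setlocale("LC_TIME", "C"))

# Hypothetical stand-in for AttReport_all (first two rows from the question),
# stored as a factor to mirror how the column was read from the file
att <- data.frame(
  Registration.Date = factor(c("Jul 05, 2011 09:30 PM EDT",
                               "Jul 05, 2011 10:20 AM EDT"))
)

# strptime() converts the factor to character itself, and the trailing
# " EDT" is ignored because the format does not mention it.
# tz = "UTC" is an assumption for reproducibility, not part of the original answer.
parsed <- strptime(att$Registration.Date, "%b %d, %Y %I:%M %p", tz = "UTC")

format(parsed, "%Y-%m-%d %H:%M")
# [1] "2011-07-05 21:30" "2011-07-05 10:20"
```

strptime() returns a POSIXlt object; if the result is going back into the data frame, wrapping it in as.POSIXct() is usually more convenient.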
Joshua Ulrich
  • dstr <- gsub(" EDT$","",as.character(AttReport_all$Registration.Date)) and strptime(dstr, "%b %d, %Y %H:%M %p", tz="") worked, but I'm not sure what it was doing. It's possible my mistake was the %m/%H substitutions. – William Gunn Jul 13 '11 at 20:48
  • Seems like my mistake was the %m/%H substitutions. Thanks. – William Gunn Jul 13 '11 at 20:54
  • Using `%m` when you should have used `%d` caused the `NA`. Using `%H` with `%p` caused a more subtle error. – Joshua Ulrich Jul 13 '11 at 20:56