
I'm working through learning tidyverse principles and was wondering if there is an easier/better way to import data that includes a datetime column in m/d/Y H:M:S AM/PM format. Currently I import with read_csv, which reads the column as character, and then I use lubridate's mdy_hms to parse it into a new datetime column:

> test <- read_csv("data.csv")
Parsed with column specification:
cols(
  ActivityMinute = col_character(),
  Steps = col_integer()
)
> head(test)         
# A tibble: 6 x 2
     ActivityMinute Steps
              <chr> <int>
1 5/12/2016 12:00:00 AM     0
2 5/12/2016 12:01:00 AM     0
3 5/12/2016 12:02:00 AM     0
4 5/12/2016 12:03:00 AM     0
5 5/12/2016 12:04:00 AM     0
6 5/12/2016 12:05:00 AM     0

> test$datetime <- mdy_hms(test$ActivityMinute)
> head(test)
# A tibble: 6 x 3
     ActivityMinute Steps            datetime
              <chr> <int>              <dttm>
1 5/12/2016 12:00:00 AM     0 2016-05-12 00:00:00
2 5/12/2016 12:01:00 AM     0 2016-05-12 00:01:00
3 5/12/2016 12:02:00 AM     0 2016-05-12 00:02:00
4 5/12/2016 12:03:00 AM     0 2016-05-12 00:03:00
5 5/12/2016 12:04:00 AM     0 2016-05-12 00:04:00
6 5/12/2016 12:05:00 AM     0 2016-05-12 00:05:00

Is there a better way to do this, perhaps using cols()? I tried specifying the ActivityMinute as col_datetime, but that didn't work. Any tips for better code/process are appreciated.

  • `read.table` has an option for `colClasses`. – R. Schifini Jun 08 '17 at 03:07
  • @R.Schifini - admittedly it takes a bit more work when you have to specify a class conversion function like https://stackoverflow.com/a/13022441/496803 – thelatemail Jun 08 '17 at 03:19
  • @thelatemail - yes, as I mentioned in the comment below, I tried this but the times weren't parsed correctly. I guess I was wondering if there was a simpler way to pass lubridate functions into the import step. – Ernesto Ramirez Jun 08 '17 at 03:24
  • Just to note that RStudio makes this process a little easier: use "Import Dataset", then you can interactively choose the column type and experiment with the format until it looks right. And then the code is generated and can be saved for next time. – neilfws Jun 08 '17 at 03:27
  • @neilfws thanks for this tip. I just tested with @Sraffa's solution below (with `%I` instead of `%H`) and it worked great. – Ernesto Ramirez Jun 08 '17 at 03:34

1 Answer


You have to set the datetime format in the col_datetime call:

test <- read_csv(
  "data.csv",
  col_types = cols(
    ActivityMinute = col_datetime("%m/%d/%Y %I:%M:%S %p"),
    Steps = col_integer()
  )
)
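
To experiment with the format string before putting it into col_types, it may help to try it on a couple of values with readr's parse_datetime, which accepts the same format syntax as col_datetime. This is only a minimal sketch: the two sample strings are made up to match the layout shown in the question, and the variable names are illustrative.

```r
library(readr)

# Two hypothetical values in the m/d/Y H:M:S AM/PM layout from the question
sample_times <- c("5/12/2016 12:00:00 AM", "5/12/2016 11:59:00 PM")

# %I is the 12-hour clock hour and %p is the AM/PM marker;
# the two are meant to be used together
parsed <- parse_datetime(sample_times, format = "%m/%d/%Y %I:%M:%S %p")

# Show the parsed values on a 24-hour clock
format(parsed, "%Y-%m-%d %H:%M:%S")
```

If the format string does not match the data, parse_datetime returns NA with a parsing warning, which makes it a quick way to check a candidate format.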
Sraffa