0

I'm trying to convert data$TIME (format: 06:07:11 PM) to hours (18 in this case) so that I can count the number of inquiries (calls) for each hour to plot a histogram or barchart.

Can someone please help me?

data = read.csv('X.csv')
data$TIME <- format(strptime(data$TIME, "%H:%M:%S %p"), format="%H:%M:%S")

but the code above is returning NAs...

  • 1
    Hi, and welcome to SO! Please take a look at [how to ask](https://stackoverflow.com/help/how-to-ask) and especially [how to make a great reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). In particular, a data sample so we can see what `data$TIME` looks like is needed so that we can help you debug. – Calum You Jul 17 '19 at 15:57
  • after above code, data$TIME returns [1] NA NA NA NA NA NA – wakanda_flava Jul 17 '19 at 16:00
  • How about hour() function from lubridate package? – denisafonin Jul 17 '19 at 16:06
  • hour(data$TIME) – denisafonin Jul 17 '19 at 16:07
  • @Denis hour(data$TIME) results in 'Error in as.POSIXlt.numeric(x, tz = tz(x)) : 'origin' must be supplied' – wakanda_flava Jul 17 '19 at 17:19

2 Answers2

2

Use strptime and then pick off the hour component. Note that when using %p the hour must be denoted by %I and not by %H.

strptime("07/01/2019 06:07:11 PM", "%m/%d/%Y %I:%M:%S %p")$hour
## [1] 18

Update

The poster changed the question after it was answered so the answer to the revised question is:

strptime("06:07:11 PM", "%I:%M:%S %p")$hour
## [1] 18
G. Grothendieck
  • 254,981
  • 17
  • 203
  • 341
  • Thanks for this but I need to use times stored in data$TIME. Can you help me write this using data$TIME? I tried doing data$TIME <- strptime(data$TIME, "%m/%d/%Y %I:%M:%S %p")$hour but this returns NAs – wakanda_flava Jul 17 '19 at 17:13
  • Note that the question explicitly states by example that the format includes a date and not just a time; however, if there were just a time the format would have to specify that and be just: `"%I:%M:%S %p"` – G. Grothendieck Jul 17 '19 at 18:56
  • @ G. Grothendieck sorry I edited the question. data$TIME actually includes only time (H:MM:SS XM) I tried using strptime(data$TIME, "%m/%d/%Y %I:%M:%S %p")$hour and it returns whole bunch NAs and no numbers – wakanda_flava Jul 17 '19 at 19:16
  • The format must correspond to the data. As per my last comment if there is no date in the data you must NOT include the date part of the format. See Update. – G. Grothendieck Jul 17 '19 at 22:21
1

You can use lubridate and dplyr to solve this. You can parse the date using lubridate then extract the hour using the hour() function. Then count the number of observations with that specified hour.

library(lubridate)
library(dplyr)


# create sample date to parse
sample_date <- "07/01/2019 06:07:11 PM"

# parse date using lubridate 
# stands for day-month-year hour-minute-sconds
parsed_date <- dmy_hms(sample_date) 

# generate sample date times
sample_dates <- seq(from = parsed_date, to = parsed_date + days(10), length.out = 240)

tibble(dates = sample_dates) %>% 
  mutate(hour = hour(dates)) %>% 
  group_by(hour) %>% 
  summarise(n_hour = n())

#> # A tibble: 24 x 2
#>     hour n_hour
#>    <int>  <int>
#>  1     0     10
#>  2     1     10
#>  3     2     10
#>  4     3     10
#>  5     4     10
#>  6     5     10
#>  7     6     10
#>  8     7     10
#>  9     8     10
#> 10     9     10
#> # … with 14 more rows

Created on 2019-07-17 by the reprex package (v0.2.1)

thus__
  • 460
  • 3
  • 16
  • Based on the change to this question, you can parse the time using `lubridate::hms()` rather than `lubridate::mdy_hms()` and the rest still applies. – thus__ Jul 17 '19 at 20:10