R: How to handle times without dates?

Question

I have data which includes Date as well as Time enter and Time exit. These latter two contain data like this: 08:02, 12:02, 23:45 etc.

I would like to manipulate the Time enter and Time exit data, for example to subtract Time enter from Time exit to work out the duration, or to plot the distributions of Time enter and Time exit, e.g. to see whether most entries are before 10:00 or most exits are after 17:00.

All the packages I've looked at require a date to precede the time, e.g. 01/02/2012 12:33.

Is this possible, or should I simply append an identical date to every time for the sake of calculations? This seems a bit messy!

Related: [Convert hour:minute:second (HH:MM:SS) string to proper time class](https://stackoverflow.com/questions/12034424/convert-hourminutesecond-hhmmss-string-to-proper-time-class) — Henrik, Dec 11 '21 at 12:50

score 20 · Accepted Answer · edited Feb 05 '19 at 20:49

Use the "times" class found in the chron package:

library(chron)

Enter <- c("09:12", "17:01")
Enter <- times(paste0(Enter, ":00"))

Exit <-  c("10:15", "18:11")
Exit <- times(paste0(Exit, ":00"))

Exit - Enter # durations

sum(Enter < "10:00:00") # number entering before 10am
mean(Enter < "10:00:00") # fraction entering before 10am

sum(Exit >  "17:00:00") # number exiting after 5pm
mean(Exit >  "17:00:00") # fraction exiting after 5pm

table(cut(hours(Enter), breaks = c(0, 10, 17, 24))) # Counts for indicated hours
## (0,10] (10,17] (17,24] 
##      1       1       0 

table(hours(Enter))  # Counts of entries each hour
## 9 17 
## 1  1

stem(hours(Enter), scale = 2)
## The decimal point is at the |

##   9 | 0
##  10 | 
##  11 | 
##  12 | 
##  13 | 
##  14 | 
##  15 | 
##  16 | 
##  17 | 0

Graphics:

tab <- c(table(Enter), -table(Exit))  # Freq at each time.  Enter is pos; Exit is neg.
plot(times(names(tab)), tab, type = "h", xlab = "Time", ylab = "Freq")
abline(v = c(10, 17)/24, col = "red", lty = 2) # vertical red lines
abline(h = 0)  # X axis
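
If chron is not available, roughly the same summaries can be sketched in base R by converting the strings to minutes since midnight. This is a swapped-in technique, not part of the answer above, and `hm_to_minutes` is a hypothetical helper name; it assumes well-formed "HH:MM" input.

```r
# Hypothetical helper (not from chron): "HH:MM" strings -> minutes since midnight
hm_to_minutes <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts, function(p) 60 * as.numeric(p[1]) + as.numeric(p[2]), numeric(1))
}

Enter <- c("09:12", "17:01")
Exit  <- c("10:15", "18:11")

enter_min <- hm_to_minutes(Enter)
exit_min  <- hm_to_minutes(Exit)

exit_min - enter_min        # durations in minutes
mean(enter_min < 10 * 60)   # fraction entering before 10:00
table(enter_min %/% 60)     # counts of entries per hour of day
```

Plain numbers lose the time-aware printing and plotting that the "times" class gives, so this is only a fallback.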

wxxyyyzz · Answer 2 · 2019-04-04T16:36:05.763

Thanks for the feedback, and sorry for the confusion. I have edited the answer a bit to clarify.

New Edit:

First, the chron package and strptime with a fixed format both work well, as demonstrated in other answers. I just want to introduce lubridate, since it is easier to use and flexible about time formats.

Example data

df <- data.frame(TimeEnterChar = c(rep("07:58", 10), "08:02", "08:03", "08:05", "08:10", "09:00"),
                 TimeExitChar  = c("16:30", "16:50", "17:00", rep("17:02", 10), "17:30", "18:59"),
                 stringsAsFactors = FALSE)

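If all you want is to count how many entry times were later than 8:00, you can compare the character strings directly. This works because the strings are zero-padded HH:MM, so alphabetical order matches chronological order. The line below shows that 5 entry times were later.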

sum(df$TimeEnterChar > "08:00")
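
Direct string comparison depends on every time being zero-padded. A minimal sketch of what goes wrong otherwise, and one possible fix, is below; `pad_hm` is a hypothetical helper, not a lubridate or base R function, and it assumes input of the form "H:MM" or "HH:MM".

```r
# String comparison misorders times that are not zero-padded
"9:00" > "10:00"   # TRUE, although 9:00 is earlier than 10:00

# Hypothetical helper: pad "H:MM" to "HH:MM" so string comparison is safe
pad_hm <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts,
         function(p) sprintf("%02d:%02d", as.integer(p[1]), as.integer(p[2])),
         character(1))
}

raw    <- c("7:58", "9:00", "10:30")
padded <- pad_hm(raw)
sum(padded > "08:00")   # counts 09:00 and 10:30
```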

If you want more, I personally like the lubridate package for time data, especially timestamps with dates, although those are not the focus of this post.

library(lubridate)
# Convert character to a "Period" class by lubridate, shows in form of H M S
df$TimeEnterTime <- hm(df$TimeEnterChar)
df$TimeExitTime  <- hm(df$TimeExitChar)
head(df)

sum(df$TimeEnterTime > hm("08:00"))

You can still compare the times after conversion.

A little more about using them as numbers: as.numeric() on a Period returns seconds. I assume only minute-level resolution is wanted, so I divide the number of seconds by 60 to get minutes.

df$DurationMinute <- as.numeric( df$TimeExitTime - df$TimeEnterTime )/60
hist(df$DurationMinute, breaks = seq(500, 600, 5))

head(df)
  TimeEnterChar TimeExitChar TimeEnterTime TimeExitTime DurationMinute
1         07:58        16:30     7H 58M 0S   16H 30M 0S            512
2         07:58        16:50     7H 58M 0S   16H 50M 0S            532
3         07:58        17:00     7H 58M 0S    17H 0M 0S            542
4         07:58        17:02     7H 58M 0S    17H 2M 0S            544
5         07:58        17:02     7H 58M 0S    17H 2M 0S            544
6         07:58        17:02     7H 58M 0S    17H 2M 0S            544

The hist() call above plots the distribution of the duration between entry and exit.

You can also look at the distribution of entry and exit times, but some effort is needed to relabel the axis.

df$TimeEnterNumMin <- as.numeric(df$TimeEnterTime) / 60
df$TimeExitNumMin  <- as.numeric(df$TimeExitTime) / 60

hist(df$TimeEnterNumMin, breaks = seq(0, 1440, 60), xaxt = 'n', main = "Whole by 1hr")
axis(side = 1, at = seq(0, 1440, 60), labels = paste0(seq(0, 24, 1), ":00"))

hist(df$TimeEnterNumMin, breaks = seq(420, 600, 15), xaxt = 'n', main = "Morning by 15min")
axis(side = 1, at = seq(420, 600, 60), labels = paste0(seq(7, 10, 1), ":00"))

I did not polish the plots or make the axes flexible; please adapt them to your needs. Hopefully it helps.
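
One way the axis could be made flexible is to derive both the breaks and the labels from the data and a chosen bin width. This is only a sketch in base R: `minutes_to_label` is a hypothetical helper, and `enter_min` hard-codes the same entry times as the example data (in minutes since midnight) so that the block runs without lubridate.

```r
# Hypothetical helper: minutes since midnight -> "HH:MM" axis labels
minutes_to_label <- function(m) {
  m <- as.integer(m)
  sprintf("%02d:%02d", m %/% 60L, m %% 60L)
}

# Entry times from the example data, as minutes since midnight
enter_min <- c(rep(478, 10), 482, 483, 485, 490, 540)

# Choose the bin width once; breaks and labels follow from the data range
bin    <- 15
lo     <- floor(min(enter_min) / bin) * bin
hi     <- ceiling(max(enter_min) / bin) * bin
breaks <- seq(lo, hi, by = bin)
if (length(breaks) < 2) breaks <- c(lo, lo + bin)  # hist() needs two break points

hist(enter_min, breaks = breaks, xaxt = "n",
     main = "Entry times", xlab = "Time")
axis(side = 1, at = breaks, labels = minutes_to_label(breaks))
```

Changing `bin` to 60 gives hourly bins without touching the axis code.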

Below is the old, useless post (no need to read; kept so that the comments don't look weird):

Came across a similar issue and was inspired by this post. @G. Grothendieck and @David Arenburg provided great answers for transforming the time.

For comparison, I feel forcing the time into numeric helps. Instead of comparing "11:22:33" with "9:00:00", comparing as.numeric(hms("11:22:33")) (which is 40953 seconds) and as.numeric(hms("9:00:00")) (32400) would be much easier.

as.numeric(hms("11:22:33")) > as.numeric(hms("9:00:00"))  &  as.numeric(hms("11:22:33")) < as.numeric(hms("17:00:00"))
[1] TRUE

The above example shows 11:22:33 is between 9AM and 5PM.

To extract just time from the date or POSIXct object, substr("2013-10-01 11:22:33 UTC", 12, 19) should work, although it looks stupid to change a time object to string/character and back to time again.
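
When the value is an actual POSIXct object rather than a string, base R's format() can pull out the time part without counting character positions. A minimal sketch, assuming the timestamp is stored in UTC:

```r
# Build a POSIXct value with an explicit time zone
x <- as.POSIXct("2013-10-01 11:22:33", tz = "UTC")

# format() returns the time of day as a character string
time_only <- format(x, "%H:%M:%S")
time_only
## [1] "11:22:33"
```

The result is still a character string, so it needs hms() or similar before doing arithmetic on it.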

Converting the time to numeric should work for plotting as @G. Grothendieck described. You can convert the numbers back to time as needed for x axis labels.

I don't see the point of your answer. `hms("11:22:33") > hms("9:00:00") & hms("11:22:33") < hms("17:00:00")` works nicely, why do you feel that wrapping each individual piece in `as.numeric` is better? — Gregor Thomas, Aug 20 '18 at 15:05
@Gregor You are absolutely right. I guess the reason why I did so initially was that I was thinking of converting it back to time format for plotting. Now I realize I can just use `hour(hms("11:22:33"))` instead of doing the math. Thanks! — wxxyyyzz, Aug 20 '18 at 15:16
I'd recommend deleting this answer. It doesn't seem to add anything useful---just distracts from the other answers. If I'm wrong and there is something useful here, then the answer should be edited to highlight that. — Gregor Thomas, Mar 28 '19 at 16:52

score 2 · Answer 3 · answered Mar 26 '14 at 11:53

Would something like this work?

SubstracTimes <-  function(TimeEnter, TimeExit){
  (as.numeric(format(strptime(TimeExit, format ="%H:%M"), "%H")) + 
  as.numeric(format(strptime(TimeExit, format ="%H:%M"), "%M"))/60) -
  (as.numeric(format(strptime(TimeEnter, format ="%H:%M"), "%H")) + 
   as.numeric(format(strptime(TimeEnter, format ="%H:%M"), "%M"))/60)
}

Testing:

TimeEnter <- "08:02"
TimeExit <- "12:02"
SubstracTimes(TimeEnter, TimeExit)
## [1] 4
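
The function above returns decimal hours. If an "H:MM" string is preferred for display, the result could be converted with a small helper such as the sketch below. `hours_to_hm` is a hypothetical name, and the sketch assumes non-negative durations (exit after entry on the same day).

```r
# Hypothetical helper: decimal hours -> "H:MM" string
hours_to_hm <- function(h) {
  stopifnot(all(h >= 0))        # assumption: no negative durations
  total_min <- round(h * 60)    # round to the nearest whole minute
  sprintf("%d:%02d",
          as.integer(total_min %/% 60),
          as.integer(total_min %% 60))
}

hours_to_hm(3.5)   # "3:30"
hours_to_hm(4)     # "4:00"
```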

David Arenburg

This works for adding/subtracting, and returns decimal hours (i.e. 3.5 rather than 3:30). For plotting, perhaps I could manually assign values to bins in order to plot frequency of enter against time. – gisol Mar 26 '14 at 11:57
Maybe provide some example dataset and explain exactly what you want to plot and what is the expected result – David Arenburg Mar 26 '14 at 12:01
