2

Good day,

I have two date columns in as.POSIXct in the format YYYY-MM-DD HH:MM:SS. I would like to get the difference between the two, displayed in the format Days Hours:Seconds. Here is some dummy data:

    a<-c("2018-03-20 11:52:25 AST", "2018-03-20 12:51:25 AST", "2018-03-20 14:19:04 AST",
"2018-03-21 14:12:12 AST", "2018-03-21 12:09:22 AST", "2018-03-21 15:28:01 AST")

b<-c("2018-04-09 18:39:38 AST", "2018-06-23 19:13:14 AST", "2018-03-20 23:23:03 AST",
     "2018-05-10 21:29:28 AST", "2018-03-22 03:17:23 AST", "2018-05-12 00:19:39 AST")

ab<-data.frame(a,b)

Which gives this data frame:

                        a                       b
 2018-03-20 11:52:25 AST 2018-04-09 18:39:38 AST
 2018-03-20 12:51:25 AST 2018-06-23 19:13:14 AST
 2018-03-20 14:19:04 AST 2018-03-20 23:23:03 AST
 2018-03-21 14:12:12 AST 2018-05-10 21:29:28 AST
 2018-03-21 12:09:22 AST 2018-03-22 03:17:23 AST
 2018-03-21 15:28:01 AST 2018-05-12 00:19:39 AST

I would like to get the difference between a and b, or subtract time a from time b to get an output of X days X hours: X seconds.

I have used difftime below, along with the units set differently:

ab$time_difference<-difftime(ab$b, ab$a)
ab
                            a                       b   time_difference
     2018-03-20 11:52:25 AST 2018-04-09 18:39:38 AST  486.786944 hours
     2018-03-20 12:51:25 AST 2018-06-23 19:13:14 AST 2286.363611 hours
     2018-03-20 14:19:04 AST 2018-03-20 23:23:03 AST    9.066389 hours
     2018-03-21 14:12:12 AST 2018-05-10 21:29:28 AST 1207.287778 hours
     2018-03-21 12:09:22 AST 2018-03-22 03:17:23 AST   15.133611 hours
     2018-03-21 15:28:01 AST 2018-05-12 00:19:39 AST 1232.860556 hours

I have also tried the following:

ab$time_difference<-difftime(ab$b, ab$a,units=c("days","hours","seconds"))

But get the error that 'units' must be a length of 1. Is there a different command I should be using, or is there any way for difftime to produce a more exact time difference?

Thanks!

4 Answers4

3

The hms library can provide some assistance here:

library(hms)
as.hms(ab$time_difference, format="%H:%M:S")
# 486:47:13
# 2286:21:49
# 09:03:59
# 1207:17:16
# 15:08:01
# 1232:51:38

See this question for other options: Outputting difftime as HH:MM:SS:mm in R

Here is the code from the above answer for your issue:

Fmt <- function(x) UseMethod("Fmt")
Fmt.difftime <- function(x) {
   units(x) <- "secs"
   x <- unclass(x)
   NextMethod()
}
Fmt.default <- function(x) {
   y <- abs(x)
   sprintf("%s%02d:%02d:%02d:%02d", 
           ifelse(x < 0, "-", ""), # sign
           y %/% 86400,  # days
           y %% 86400 %/% 3600,  # hours 
           y %% 3600 %/% 60,  # minutes
           y %% 60 %/% 1) # seconds
}


a<-c("2018-03-20 11:52:25 AST", "2018-03-20 12:51:25 AST", "2018-03-20 14:19:04 AST",
     "2018-03-21 14:12:12 AST", "2018-03-21 12:09:22 AST", "2018-03-21 15:28:01 AST")

b<-c("2018-04-09 18:39:38 AST", "2018-06-23 19:13:14 AST", "2018-03-20 23:23:03 AST",
     "2018-05-10 21:29:28 AST", "2018-03-22 03:17:23 AST", "2018-05-12 00:19:39 AST")
ab<-data.frame(a,b)

#Passing two dates to  the function(s)
Fmt(as.POSIXct(ab$b)-as.POSIXct(ab$a))
#Passing a time difference in seconds
Fmt(difftime(ab$b, ab$a, units="secs"))

The key here is run the code for the function definitions at the start of your script so that the functions are then available for use.

Dave2e
  • 22,192
  • 18
  • 42
  • 50
  • Thanks, this works by reformatting the time difference column into hours:minutes:seconds, but I cannot get it to format as days as well. I also looked into the question that you linked, and tried to use the function but don't quite understand how to use it, and couldn't get it to reformat my data column. – Elizabeth Smith Mar 25 '19 at 15:13
  • Thanks for the response and edit @Dave2e. I have run the code, but keep getting the error message "Error in `*tmp*`$diffs : $ operator is invalid for atomic vectors". I tried is.atomic and got TRUE for each column. According to this question https://stackoverflow.com/questions/23299684/r-error-in-xed-operator-is-invalid-for-atomic-vectors I have to use Fmt(as.POSIXct(ab["b"])-as.POSIXct(ab["a"]), however I then get the error " do not know how to convert 'tag_final["last_pos"]' to class “POSIXct”. When I do str(ab) it says these columns are in POSIXct. What am I missing here? – Elizabeth Smith Mar 25 '19 at 18:19
  • If the columns are class POSIXct, then they are all ready a date time object and you should use the `difftime` function. If you are quoting the column names you will need to used double brackets `Fmt(as.POSIXct(ab[["b"]])-as.POSIXct(ab[["a"]])` – Dave2e Mar 25 '19 at 19:07
2

Since you would like days, hours, minutes, seconds, we can get this result with the lubridate package:

a<-c("2018-03-20 11:52:25 AST", "2018-03-20 12:51:25 AST", "2018-03-20 14:19:04 AST",
 "2018-03-21 14:12:12 AST", "2018-03-21 12:09:22 AST", "2018-03-21 15:28:01 AST")

b<-c("2018-04-09 18:39:38 AST", "2018-06-23 19:13:14 AST", "2018-03-20 23:23:03 AST",
 "2018-05-10 21:29:28 AST", "2018-03-22 03:17:23 AST", "2018-05-12 00:19:39 AST")

a = as.POSIXct(a)
b = as.POSIXct(b)

library(lubridate)
timespan = interval(ymd_hms(ab[,1]), ymd_hms(ab[,2]))
> as.period(timespan)
[1] "20d 6H 47M 13S"    "3m 3d 6H 21M 49S"  "9H 3M 59S"         "1m 19d 7H 17M 16S"
[5] "15H 8M 1S"         "1m 20d 8H 51M 38S"

If desired, we can convert months to days by specifying the formatting as follows:

> as.period(timespan, unit = "day")
[1] "20d 6H 47M 13S" "95d 6H 21M 49S" "9H 3M 59S"      "50d 7H 17M 16S"
[5] "15H 8M 1S"      "51d 8H 51M 38S"
NM_
  • 1,887
  • 3
  • 12
  • 27
  • Using a `period` object for this seems like a less than ideal idea - "*Because periods do not have a fixed length, [they can not be accurately converted to and from Duration objects](https://www.rdocumentation.org/packages/lubridate/versions/1.7.4/topics/as.period)*" – thelatemail Mar 24 '19 at 22:13
  • Thank you. The second value in the output has 3m and 21 M... Does the lower-case m stand for miliseconds? If so, is there a way to remove miliseconds? – Elizabeth Smith Mar 25 '19 at 18:29
  • The lowercase “m” stands for months (since the difference between the dates is so large). – NM_ Mar 25 '19 at 18:38
  • Ah, I see, thank you. Since the days in months vary, is there any way to get the months to just default to days? So my output would just be Days, Hours, Minutes, Seconds? – Elizabeth Smith Mar 25 '19 at 18:54
  • @ElizabethSmith : Yes we can change months to days by specifying the formatting. I have edited my answer and added a little bit of code at the end to do this. – NM_ Mar 25 '19 at 19:14
  • Thank you so much! This is exactly what I'm looking for! – Elizabeth Smith Mar 25 '19 at 19:18
  • @ElizabethSmith : Glad I was able to help :) – NM_ Mar 25 '19 at 19:24
1
require(lubridate)

a<-c("2018-03-20 11:52:25 AST", "2018-03-20 12:51:25 AST", "2018-03-20 14:19:04 AST",
     "2018-03-21 14:12:12 AST", "2018-03-21 12:09:22 AST", "2018-03-21 15:28:01 AST")

b<-c("2018-04-09 18:39:38 AST", "2018-06-23 19:13:14 AST", "2018-03-20 23:23:03 AST",
     "2018-05-10 21:29:28 AST", "2018-03-22 03:17:23 AST", "2018-05-12 00:19:39 AST")

# Make df
ab <- data.frame(a = as.POSIXct(a),b = as.POSIXct(b),stringsAsFactors = FALSE)

# Time diff
ab$time_difference <- ab$b - ab$a
ab$time_difference <- as.duration(ab$time_difference)
ab$time_difference 

1 2018-03-20 11:52:25 2018-04-09 18:39:38   1752433s (~2.9 weeks)
2 2018-03-20 12:51:25 2018-06-23 19:13:14 8230909s (~13.61 weeks)
3 2018-03-20 14:19:04 2018-03-20 23:23:03    32639s (~9.07 hours)
4 2018-03-21 14:12:12 2018-05-10 21:29:28  4346236s (~7.19 weeks)
5 2018-03-21 12:09:22 2018-03-22 03:17:23   54481s (~15.13 hours)
6 2018-03-21 15:28:01 2018-05-12 00:19:39  4438298s (~7.34 weeks)
Andrew Bannerman
  • 1,235
  • 2
  • 16
  • 36
1

Use sprintf and modular arithmetic:

# first, be sure to specify units in difftime, or it will internally
#   choose units for each row
# using 'secs' here since it's the lowest common denominator
# wrapping as.double() to remove the class attribute which will
#   screw up dispatch to Ops below
ab$time_difference <- as.double(difftime(ab$b, ab$a, units = 'secs'))

#  3600 =   60*60 seconds in an hour;
# 86400 = 3600*24 seconds in a day
ab$hms = with(ab, sprintf('%d days; %d hours; %d seconds',
                          time_difference %/% 86400L,
                          (time_difference %% 86400L) %/% 3600L,
                          time_difference %% 3600L))
ab$hms
# [1] "20 days; 6 hours; 2833 seconds" "95 days; 6 hours; 1309 seconds"
# [3] "0 days; 9 hours; 239 seconds"   "50 days; 7 hours; 1036 seconds"
# [5] "0 days; 15 hours; 481 seconds"  "51 days; 8 hours; 3098 seconds"

I chose a particularly verbose output format just for illustration; the building blocks are here to roll your own, of course, keeping in mind that you should replace %d with %02d (e.g.) to left-0-pad the output to 2 digits.

MichaelChirico
  • 33,841
  • 14
  • 113
  • 198