Preface:

I have a column in a data.table of difftime values with units set to days. I am trying to create another data.table summarizing the values with

dt2 <- dt[, .(AvgTime = mean(DiffTime)), by = Group]

When printing the new data.table, I see values such as

1.925988e+00 days
1.143287e+00 days
1.453975e+01 days

I would like to limit the number of decimal places for this column only (i.e. without setting options(), unless that can be done specifically for difftime values). When I try to do this by modifying the call above, e.g.

dt2 <- dt[, .(AvgTime = round(mean(DiffTime), 2)), by = Group]

I am left with NA values, with both the base round() and format() functions returning the warning:

In mean(DiffTime) : argument is not numeric or logical.

Oddly enough, if I perform the same operation on a numeric field, this runs with no problems. Also, if I run the two separate lines of code, I can accomplish what I am looking to do:

dt2 <- dt[, .(AvgTime = mean(DiffTime)), by = Group]
dt2[, AvgTime := round(AvgTime, 2)]

Reproducible Example:

library(data.table)
set.seed(1)
dt <- data.table(
  Date1 = 
    sample(seq(as.Date('2017/10/01'), 
               as.Date('2017/10/31'), 
               by="days"), 24, replace = FALSE) +
    abs(rnorm(24)) / 10,
  Date2 = 
    sample(seq(as.Date('2017/10/01'), 
               as.Date('2017/10/31'), 
               by="days"), 24, replace = FALSE) +
    abs(rnorm(24)) / 10,
  Num1 =
    abs(rnorm(24)) * 10,
  Group = 
    rep(LETTERS[1:4], each=6)
)
dt[, DiffTime := abs(difftime(Date1, Date2, units = 'days'))]

# Warnings/NA:
class(dt$DiffTime) # "difftime"
dt2 <- dt[, .(AvgTime = round(mean(DiffTime), 2)), by = .(Group)]

# Works when numeric/not difftime:
class(dt$Num1) # "numeric"
dt2 <- dt[, .(AvgNum = round(mean(Num1), 2)), by = .(Group)]

# Works, but takes an additional step:
dt2 <- dt[, .(AvgTime = mean(DiffTime)), by = .(Group)]
dt2[, AvgTime := round(AvgTime, 2)]

# Works with base::mean:
class(dt$DiffTime) # "difftime"
dt2 <- dt[, .(AvgTime = round(base::mean(DiffTime), 2)), by = .(Group)]

Question:

Why am I not able to complete this conversion (rounding of the mean) in one step when the class is difftime? Am I missing something in my execution? Is this some sort of bug in data.table where it can't properly handle the difftime?

Issue added on GitHub.

Update: Issue appears to be cleared after updating from data.table version 1.10.4 to 1.12.8.

Gaffi

3 Answers

This was fixed by update #3567 on 2019/05/15, which shipped in data.table version 1.12.4 (released 2019/10/03).
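
To see which side of the fix an installation is on, a small check along these lines may help. It is only a sketch: it takes the 1.12.4 version number from this answer as given, and the variable names are purely illustrative.

```r
# Assumption: 1.12.4 is the first release containing the fix, as stated above.
fixed_in  <- package_version("1.12.4")
installed <- utils::packageVersion("data.table")

if (installed < fixed_in) {
  message("data.table ", as.character(installed),
          " predates the fix; update the package or use base::mean().")
} else {
  message("data.table ", as.character(installed), " should include the fix.")
}
```

If the installed version is older, install.packages("data.table") should pull a current release from CRAN.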

Gaffi

This might be a little late, but if you really want it to work, you can do:

as.numeric(round(as.difftime(difftime(DATE1, DATE2)), 0))
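
Applied to the grouped summary from the question, the same idea (do the arithmetic on plain numbers) could look like the sketch below. The toy data and column values are made up for illustration, and it assumes the difftime column is measured in days; it converts to numeric before averaging, rounds, and then restores the difftime class.

```r
library(data.table)

# Hypothetical toy data: difftime values in days, two groups.
dt <- data.table(
  Group    = rep(c("A", "B"), each = 3),
  DiffTime = as.difftime(c(1.2, 2.3, 3.5, 10.1, 12.2, 14.4), units = "days")
)

# Average the plain numeric values, round to 2 decimals,
# then wrap the result back into a difftime in days.
dt2 <- dt[, .(AvgTime = as.difftime(
  round(mean(as.numeric(DiffTime, units = "days")), 2),
  units = "days"
)), by = Group]

print(dt2)
```

Because mean() only ever sees a numeric vector here, this should not depend on how a given data.table version handles difftime inside mean().
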
Alexis Drakopoulos
  • This has been fixed with an update to the `data.table` package. See my new answer/update. – Gaffi Feb 03 '20 at 14:24

I recently ran into the same problem using data.table_1.11.8. One quick workaround is to use base::mean instead of mean.

Matt Motoki
  • This has been fixed with an update to the `data.table` package. See my new answer/update. – Gaffi Feb 03 '20 at 14:24