
Using an example from a related issue: nearest month end in R

library(lubridate)
library(dplyr)

dt <- data.frame(orig_dt = as.Date(c("1997-04-01", "1997-06-29")))
dt %>% mutate(round_dt = round_date(orig_dt, unit = "month"),
              modified_dt = round_date(orig_dt, unit = "month") - days(1))

In one session I get the correctly rounded dates (R 4.0.0, Rcpp_1.0.4.6 loaded via a namespace):

     orig_dt   round_dt modified_dt
1 1997-04-01 1997-04-01  1997-03-31
2 1997-06-29 1997-07-01  1997-06-30

In another session the dates are floored instead of rounded (different machine, R 4.0.2, Rcpp not loaded via a namespace):

     orig_dt   round_dt modified_dt
1 1997-04-01 1997-04-01  1997-03-31
2 1997-06-29 1997-06-01  1997-05-31

I think it could be related to Rcpp, as earlier I got this error message:

Error in C_valid_tz(tzone) (rscrpt.R#27): function 'Rcpp_precious_remove' not provided by package 'Rcpp'

Although I am no longer getting the error, the values still differ between the two sessions. Why does this happen, and how can I fix it without a complete reinstallation?

Manuela R.
  • it may be related to https://stackoverflow.com/questions/68416435/rcpp-package-doesnt-include-rcpp-precious-remove. You could reinstall Rcpp and see. – user20650 Dec 04 '21 at 12:33
  • Thank you! `Rcpp` also caused issues, but they turned out to be unrelated to the `round_date` issue, as answered below. I have also updated `Rcpp` now, so everything should be fixed! – Manuela R. Dec 04 '21 at 14:18

1 Answer


I am able to reproduce your issue in a vanilla R session.

$ R --vanilla
> packageVersion("lubridate")
[1] ‘1.8.0’
> library("lubridate")
> round_date(x = as.Date("1997-06-29"), unit = "month")
[1] "1997-06-01"

It seems to be a bug in round_date, introduced in this commit. Prior to the commit, the body of round_date contained:

above <- unclass(as.POSIXct(ceiling_date(x, unit = unit, week_start = week_start)))
mid <- unclass(as.POSIXct(x))
below <- unclass(as.POSIXct(floor_date(x, unit = unit, week_start = week_start)))

Here, below, mid, and above are defined as the number of seconds from 1970-01-01 00:00:00 UTC to the month-floor of x, x, and the month-ceiling of x, respectively (more precisely, time 00:00:00 on those three Dates, in your system's time zone). Thus, below < mid < above, and round_date would compare mid-below to above-mid to determine which of below and above was closer to mid.
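To make the comparison concrete, here is a minimal base-R sketch of the same idea. It is not lubridate's code: the helper name round_to_month_sketch is hypothetical, it only handles unit = "month", and the tie rule (an exact midpoint rounds up) is an assumption on my part.

```r
# Hypothetical helper, NOT lubridate's implementation: round Dates to the
# nearest month boundary by comparing distances measured in one unit (seconds).
round_to_month_sketch <- function(x) {
  stopifnot(inherits(x, "Date"))

  # Month-floor: first day of the month containing x.
  below_d <- as.Date(format(x, "%Y-%m-01"))

  # Month-ceiling: first day of the following month, one element at a time.
  above_d <- do.call(c, lapply(below_d, function(d) {
    seq(d, by = "month", length.out = 2)[2]
  }))

  # Convert all three to seconds since 1970-01-01 so they are comparable.
  to_seconds <- function(d) as.numeric(as.POSIXct(d))
  below <- to_seconds(below_d)
  mid   <- to_seconds(x)
  above <- to_seconds(above_d)

  # Pick whichever boundary is closer (assumption: ties go up).
  pick_above <- (above - mid) <= (mid - below)
  res <- below_d
  res[pick_above] <- above_d[pick_above]
  res
}

round_to_month_sketch(as.Date(c("1997-04-01", "1997-06-29")))
```

With the two dates from the question, this should return 1997-04-01 and 1997-07-01, matching the output of the session that rounds correctly.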

Since the commit, mid has been defined as

mid <- unclass(x)

which is the number of days from 1970-01-01 to x. Now, mid << below < above, making mid-below negative and above-mid positive. As a result, round_date considers below to be "closer" to mid than above, and it incorrectly rounds 1997-06-29 down to 1997-06-01.
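The unit mismatch can be checked in base R without lubridate. This is only an illustrative sketch: the variable names are mine, and the month boundaries are hard-coded for the date in the question.

```r
x <- as.Date("1997-06-29")

# Month-floor and month-ceiling of x, in seconds since 1970-01-01.
below <- as.numeric(as.POSIXct(as.Date("1997-06-01")))
above <- as.numeric(as.POSIXct(as.Date("1997-07-01")))

mid_seconds <- as.numeric(as.POSIXct(x))  # seconds, as before the commit
mid_days    <- unclass(x)                 # days, as after the commit

# Same units: both distances are non-negative and the ceiling is closer.
c(to_below = mid_seconds - below, to_above = above - mid_seconds)

# Mixed units: the "distance" to below is negative, so any rule that picks
# the smaller distance will always choose below.
c(to_below = mid_days - below, to_above = above - mid_days)
```

In the first comparison the distance to the ceiling (2 days' worth of seconds) is smaller than the distance to the floor (28 days' worth), so the date rounds up. In the second, the distance to the floor comes out negative, which is why the floor always wins.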

I have reported the regression to the package maintainers here. I imagine that it will be fixed soon...

In the meantime, you can try reverting to an older version of lubridate, from before the commit, or using this temporary workaround:

round_date_patched <- function(x, unit) {
  as.Date(round_date(as.POSIXct(x), unit = unit))
}
round_date_patched(x = as.Date("1997-06-29"), unit = "month") # "1997-07-01"
Mikael Jagan
  • Thank you, for the detailed explanation, that is very helpful! - and for reporting the issue! Using an earlier `lubridate` version solved the issue (via `install_version("lubridate", version = "1.7.9", repos = "http://cran.us.r-project.org")`). Hope it gets fixed soon! – Manuela R. Dec 04 '21 at 14:10