This question is related to "Group by id and drug (with dates <100 days of each other) take the earliest and latest date".

The dataset is:

mydata = data.frame (Id =c(1,1,1,1,1,1,1,1,1,1),
                     Date = c("2000-01-01","2000-01-05","2000-02-02", "2000-02-12", 
                              "2000-02-14","2000-05-13", "2000-05-15", "2000-05-17", 
                              "2000-05-16", "2000-05-20"),
                     drug = c("A","A","B","B","B","A","A","A","C","C"))

   Id       Date drug
1   1 2000-01-01    A
2   1 2000-01-05    A
3   1 2000-02-02    B
4   1 2000-02-12    B
5   1 2000-02-14    B
6   1 2000-05-13    A
7   1 2000-05-15    A
8   1 2000-05-17    A
9   1 2000-05-16    C
10  1 2000-05-20    C

With this code:

library(lubridate)
library(dplyr)

mydata %>% 
  group_by(Id, drug) %>% 
  mutate(Date = ymd(Date),
         Diff = as.numeric(Date - lag(Date, default = Date[1])),
         startDate = min(Date, na.rm = T),
         endDate = max(Date, na.rm = T),
         startDate =  ifelse(Diff > 100, Date, startdate)
         )

      Id Date       drug   Diff startDate endDate   
   <dbl> <date>     <chr> <dbl>     <dbl> <date>    
 1     1 2000-01-01 A         0     17257 2000-05-17
 2     1 2000-01-05 A         4     17257 2000-05-17
 3     1 2000-02-02 B         0     17257 2000-02-14
 4     1 2000-02-12 B        10     17257 2000-02-14
 5     1 2000-02-14 B         2     17257 2000-02-14
 6     1 2000-05-13 A       129     11090 2000-05-17
 7     1 2000-05-15 A         2     17257 2000-05-17
 8     1 2000-05-17 A         2     17257 2000-05-17
 9     1 2000-05-16 C         0     17257 2000-05-20
10     1 2000-05-20 C         4     17257 2000-05-20

In the last line of the mutate() call, the class of the startDate column changes from date to double, and I don't understand why.

I have tried origin = "1970-01-01", as.Date, ymd ...

So my question is why does this happen?

TarJae
  • Typo in the FALSE condition? It says startdate, not startDate. – deschen Apr 27 '22 at 22:39
  • Ok. It is time to sleep now. :-) many thanks @deschen! – TarJae Apr 27 '22 at 22:45
  • Well... correcting the typo still creates a double, or am I mistaken? If you change `ifelse` into the more strict `if_else` the output should be a `Date`. – Martin Gal Apr 27 '22 at 22:57
  • I completely agree with @Martin Gal. In retrospect, maybe this question is obvious for some users, but I really invested a lot of time trying to get a grip on it. After a good sleep, I remembered the issue with `ifelse` from studying Hadley Wickham's `dplyr` lectures. But at the time it was worth asking! – TarJae Apr 28 '22 at 08:01

1 Answer

The reason for ifelse() changing the class from date to double is documented in help("ifelse"):

The mode of the result may depend on the value of test (see the examples), and the class attribute (see oldClass) of the result is taken from test and may be inappropriate for the values selected from yes and no.
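
The effect described in the help text can be reproduced with a small base R sketch. The names `d`, `test`, and `res` are illustrative only and do not come from the question:

```r
# Minimal sketch: ifelse() builds its result from `test`,
# so the Date class of `yes` and `no` is lost.
d    <- as.Date(c("2000-01-01", "2000-05-13"))
test <- c(FALSE, TRUE)

res <- ifelse(test, d, d[1])

class(d)    # "Date"
class(res)  # "numeric": the class attribute was dropped
res         # 10957 11090, i.e. days since 1970-01-01
```

The second value, 11090, is the same number that appears in row 6 of the question's output.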

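dplyr::if_else() is stricter: it checks that the true and false values have compatible types and preserves their class, so it is likely more appropriate here: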

mydata %>% 
  group_by(Id, drug) %>% 
  mutate(Date = lubridate::ymd(Date),
         Diff = as.numeric(Date - lag(Date, default = Date[1])),
         startDate = min(Date, na.rm = T),
         endDate = max(Date, na.rm = T),
         startDate =  if_else(Diff > 100, Date, startDate)
  )

returns

# A tibble: 10 × 6
# Groups:   Id, drug [3]
      Id Date       drug   Diff startDate  endDate   
   <dbl> <date>     <fct> <dbl> <date>     <date>    
 1     1 2000-01-01 A         0 2000-01-01 2000-05-17
 2     1 2000-01-05 A         4 2000-01-01 2000-05-17
 3     1 2000-02-02 B         0 2000-02-02 2000-02-14
 4     1 2000-02-12 B        10 2000-02-02 2000-02-14
 5     1 2000-02-14 B         2 2000-02-02 2000-02-14
 6     1 2000-05-13 A       129 2000-05-13 2000-05-17
 7     1 2000-05-15 A         2 2000-01-01 2000-05-17
 8     1 2000-05-17 A         2 2000-01-01 2000-05-17
 9     1 2000-05-16 C         0 2000-05-16 2000-05-20
10     1 2000-05-20 C         4 2000-05-16 2000-05-20
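
If dplyr is not an option, the Date class can also be kept in base R. The following is only a sketch under that assumption, with made-up names (`d`, `diff_days`, `fixed1`, `fixed2`); it is not part of the original answer:

```r
# Illustrative data: three dates and their day differences
d         <- as.Date(c("2000-01-01", "2000-05-13", "2000-05-15"))
diff_days <- c(0, 129, 2)

# Option 1: let ifelse() drop the class, then convert the numbers back
fixed1 <- as.Date(ifelse(diff_days > 100, d, min(d)), origin = "1970-01-01")

# Option 2: start from a Date vector and overwrite by index,
# which never leaves the Date class
fixed2 <- rep(min(d), length(d))
fixed2[diff_days > 100] <- d[diff_days > 100]

class(fixed1)  # "Date"
class(fixed2)  # "Date"
```

Both options give the same dates; the second avoids the round trip through numeric values.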
Uwe
  • A good reference (and likely dupe-link) for `ifelse`-class issues is https://stackoverflow.com/q/6668963/3358272 – r2evans Apr 27 '22 at 23:00