How to mutate new columns in R based on earliest and latest dates for other variables

Question

In a dataset where each patient had multiple test administrations and a score on each test date, I have to identify the earliest & latest test dates, then subtract the difference of the scores of those dates. I think I've identified the first & last dates through dplyr, creating new columns for those:

SplitDates <- SortedDates %>% 
  group_by(PatientID) %>% 
  mutate(EarliestTestDate = min(AdministrationDate), 
         LatestTestDate = max(AdministrationDate)) %>% 
  arrange(desc(PatientID))

Score column is TotalScore

Now how do I extract the scores from these 2 dates (for each patient) to create new columns of earliest & latest scores? Haven't been able to figure out a mutate with case_when or if_else to create a score based on a record with a certain date.

You can filter with a condition like `AdministrationDate == EarliestTestDate | AdministrationDate == LatestTestDate` or just `AdministrationDate %in% range(AdministrationDate)` — camille, Dec 18 '21 at 22:05
Also perhaps use mutate to create a 'earliest - latest' column, e.g. `SplitDates <- SortedDates %>% group_by(PatientID) %>% mutate(EarliestTestDate = min(AdministrationDate), LatestTestDate = max(AdministrationDate), earliest_minus_latest = EarliestTestDate - LatestTestDate) %>% arrange(desc(PatientID))` — jared_mamrot, Dec 18 '21 at 22:06
@jared_mamrot I think they're trying to get the difference of some score, not the dates themselves. Hard to say without a [reproducible example](https://stackoverflow.com/q/5963269/5325862) — camille, Dec 18 '21 at 22:08
These answers cover the 1st part of how to get the earliest/latest test dates. The 2nd part of my problem is: I have a numerical TotalScore column. I want to create new columns for EarliestTestScore & LatestTestScore by somehow taking the TotalScore from the cases for each patient's EarliestTestDate and LatestTestDate. — CodeMonkey, Dec 18 '21 at 22:13
That's why a sample of data would be helpful, although I'm pretty sure there should be posts already that have you covered — camille, Dec 18 '21 at 23:12

score 0 · Answer 1 · answered Dec 20 '21 at 16:00

Have you tried to use one combine verb, like left_join, for example?

SplitDates <- SortedDates %>% 
    group_by(PatientID) %>% 
    mutate(EarliestTestDate = min(AdministrationDate), 
        LatestTestDate = max(AdministrationDate)) %>% 
    ungroup() %>%
    left_join(SortedDates,
        by = c(“PatientID” = “PatientID”, “AdministrationDate” = “EarliestTestDate”)) %>% # picking the score of EarliestTestDate
    left_join(SortedDates,
        by = c(“PatientID” = “PatientID”, “AdministrationDate” = “LatestTestDate”)) %>% # picking the score of EarliestTestDate
    arrange(desc(PatientID)) # now you can make the mutante task that you need.

I suggest to you see the dplyr cheatsheet.

How to mutate new columns in R based on earliest and latest dates for other variables

1 Answers1