Selecting unique rows based on value and earliest date and keeping all columns, R

Question

I have a dataframe with a column of IDs, some of which occur multiple times, a column of dates, and several other columns. For each ID, I want to select the row with the earliest date, while keeping the remaining columns in the data set.

library(dplyr)

ID = c("1", "2", "3", "3", "3", "4")
date = c("2019-10-06", "2019-08-29", "2019-08-09", "2019-02-01", "2019-11-17", "2019-05-24")
filler = c(1:6)
df <- data.frame(ID, date, filler)

df$date <- as.Date(df$date)
dfunique <- df %>% group_by(ID) %>% summarise(min_date = min(date))

I have tried the summarise function, which selects the correct rows but drops the filler column. I have also tried the distinct function, which keeps all columns but chooses the wrong rows.

dfunique2 <- df %>% distinct(ID, .keep_all = TRUE)

I hope to get a dataframe like this:

|ID  | date     | filler|
|:--:|:--------:|:-----:|
| 1  |2019-10-06|  1    |
| 2  |2019-08-29|  2    |
| 3  |2019-02-01|  4    |
| 4  |2019-05-24|  6    |

How can I include the remaining column(s) of my dataframe while selecting the correct rows? Thanks.

`dfunique <- df %>% group_by(ID) %>% slice(which.min(date))` — Ronak Shah, May 01 '21 at 12:33

score 0 · Accepted Answer · answered May 01 '21 at 12:34

library(dplyr)

df$date <- as.Date(df$date)
# Within each ID, keep only the row(s) whose date equals the group's earliest date
df %>% group_by(ID) %>%
  filter(date == min(date))

# A tibble: 4 x 3
# Groups:   ID [4]
  ID    date       filler
  <chr> <date>      <int>
1 1     2019-10-06      1
2 2     2019-08-29      2
3 3     2019-02-01      4
4 4     2019-05-24      6
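
Note that `filter(date == min(date))` returns every row that matches the minimum, so an ID whose earliest date appears on more than one row would come back more than once. If exactly one row per ID is needed, the following is a minimal sketch of two alternatives. It assumes dplyr 1.0.0 or later for `slice_min()`, and the names `earliest` and `earliest_sorted` are only illustrative.

```r
library(dplyr)

# Same example data as in the question
df <- data.frame(
  ID = c("1", "2", "3", "3", "3", "4"),
  date = as.Date(c("2019-10-06", "2019-08-29", "2019-08-09",
                   "2019-02-01", "2019-11-17", "2019-05-24")),
  filler = 1:6
)

# Option 1: take the single earliest row per ID.
# with_ties = FALSE keeps one row even if several share the minimum date.
earliest <- df %>%
  group_by(ID) %>%
  slice_min(date, n = 1, with_ties = FALSE) %>%
  ungroup()

# Option 2: sort by date within ID first, so that distinct()
# keeps the earliest row, since it retains the first row it sees per ID.
earliest_sorted <- df %>%
  arrange(ID, date) %>%
  distinct(ID, .keep_all = TRUE)
```

The second option also suggests why `distinct()` alone picked the wrong rows in the question: it keeps the first row per ID in the data's current order, which is not necessarily the earliest date.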
