0

I have a table with three columns,

ID      Drug      Date 
ABCDE   A01AA01   2000-01-01
ABCDE   A01AA01   2000-04-01
FGHIJ   A01AA01   2000-02-02
FGHIJ   A01AA01   2000-05-02
...

where the date refers to the day the drug was purchased, meaning that for each ID + Drug combination there are multiple dates (since people have to buy their medicine multiple times over the course of a few years). I'm trying to create a new table with the newest instance of every ID + Drug combination, i.e. the last time they purchased the drug. This is probably easy, but I've been struggling with this for a few days and can't find the solution. Thanks in advance.

Haddi
  • 15
  • 3

2 Answers2

1

With df being your data, this could help:

library(dplyr)

df %>% group_by(ID,Drug) %>% filter(Date==max(Date))
Duck
  • 39,058
  • 13
  • 42
  • 84
1

First make sure your dates are in Date objects. (make sure the days/months are correct)

df$Date <- as.Date(df$Date), format="%Y-%d-%m")

then use

aggregate(Date ~ ., df, max)

     ID    Drug       Date
1 ABCDE A01AA01 2000-04-01
2 FGHIJ A01AA01 2000-05-02
Daniel O
  • 4,258
  • 6
  • 20