I'm not sure how to title this properly, but here is my question.
I have the following data:
ID Name Type Date Amount
1 AAAA First 2009/7/20 100
1 AAAA First 2010/2/3 200
2 BBBB First 2015/3/10 250
2 CCC Second 2009/2/23 300
2 CCC Second 2010/1/25 400
2 CCC Third 2015/4/9 500
2 CCC Third 2016/6/25 700
I want to remove the rows that have the same ID, Name, and Type but an earlier Date. In other words, within each group I want to keep only the row with the latest Date. The Amount values of the rows in each group should be summed.
The desired result is:
ID Name Type Date Amount
1 AAAA First 2010/2/3 300
2 BBBB First 2015/3/10 250
2 CCC Second 2010/1/25 700
2 CCC Third 2016/6/25 1200
I know I can use duplicated() to find which observations are duplicates.
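library(data.table)

# Sample data, including the Amount column
dt <- fread("
ID Name Type Date Amount
1 AAAA First 2009/7/20 100
1 AAAA First 2010/2/3 200
2 BBBB First 2015/3/10 250
2 CCC Second 2009/2/23 300
2 CCC Second 2010/1/25 400
2 CCC Third 2015/4/9 500
2 CCC Third 2016/6/25 700
")
dt$Date <- as.Date(dt$Date)
# My attempt: keep rows flagged as duplicates in every column
dt[duplicated(ID) & duplicated(Name) & duplicated(Type)]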
   ID Name   Type       Date Amount
1:  1 AAAA  First 2010-02-03    200
2:  2  CCC Second 2010-01-25    400
3:  2  CCC  Third 2016-06-25    700
However, this is not what I want. Although it removes the rows with the earlier Date, it also drops the third observation (ID = 2, Name = BBBB, Type = First), which has no duplicate and should be kept. I also still need to sum Amount within each group.
How can I do this?
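For reference, the result described above can be sketched as a grouped aggregation rather than a duplicated() filter. This is only a minimal sketch, assuming data.table is acceptable and that every row sharing ID, Name, and Type belongs to one group; the name res is hypothetical.

```r
library(data.table)

dt <- fread("
ID Name Type Date Amount
1 AAAA First 2009/7/20 100
1 AAAA First 2010/2/3 200
2 BBBB First 2015/3/10 250
2 CCC Second 2009/2/23 300
2 CCC Second 2010/1/25 400
2 CCC Third 2015/4/9 500
2 CCC Third 2016/6/25 700
")

# Convert the year/month/day strings to Date so max() compares them chronologically
dt[, Date := as.Date(Date, format = "%Y/%m/%d")]

# One row per (ID, Name, Type): latest Date and total Amount.
# Groups with a single row (e.g. ID 2, BBBB, First) are kept unchanged.
res <- dt[, .(Date = max(Date), Amount = sum(Amount)), by = .(ID, Name, Type)]
print(res)
```

Because the grouping is done on the combination of the three columns, a row with no duplicate still forms its own group and survives, which is where checking duplicated() column by column falls short.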