
I have a data set like this:

[image of the data, not available]

Each patient has several records at different ages. How can I keep only the record with the oldest age for each patient? Thank you!

oldest_age <- aggregate(AGE~RANDID,data=df,max)

This is what I tried, but it keeps only the AGE and RANDID columns.

Louis

1 Answer


It would be much easier to help if you shared your data using dput(), e.g. dput(df).

Anyway, you could use .I in data.table, which holds the row numbers (indices) of the rows in each group. Combined with which.max(AGE), it gives the row number of the oldest record for each RANDID.

library(data.table)

df <- data.table(
  RANDID = c(2448, 2448, 6238, 6238, 6238, 9428, 9428, 10552, 10552),
  SEX = c(1, 1, 2, 2, 2, 1, 1, 2, 2),
  TOTCHOL = c(195, 209, 250, 260, 237, 245, 283, 225, 232),
  AGE = c(39, 52, 46, 52, 58, 48, 54, 61, 67)
)

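# Inner call: row number of the max-AGE record per RANDID (returned in column V1).
# Outer call: subset df by those row numbers, keeping all columns.
df[df[, .I[which.max(AGE)], by = RANDID]$V1]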

output

   RANDID   SEX TOTCHOL   AGE
    <num> <num>   <num> <num>
1:   2448     1     209    52
2:   6238     2     237    58
3:   9428     1     283    54
4:  10552     2     232    67
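
To see what the inner call contributes, it can help to run the two steps separately. The sketch below rebuilds the same example table under the name dt (a name chosen only for this illustration) and prints the per-patient row numbers before using them to subset. It also sketches a base R route that builds on the aggregate() attempt from the question by merging the maxima back onto the full data; that part assumes each patient has a unique maximum age, since merge() would otherwise keep every tied row.

```r
library(data.table)

# Same example data as above, under an illustrative name
dt <- data.table(
  RANDID = c(2448, 2448, 6238, 6238, 6238, 9428, 9428, 10552, 10552),
  SEX = c(1, 1, 2, 2, 2, 1, 1, 2, 2),
  TOTCHOL = c(195, 209, 250, 260, 237, 245, 283, 225, 232),
  AGE = c(39, 52, 46, 52, 58, 48, 54, 61, 67)
)

# Step 1: within each RANDID group, .I holds that group's row numbers in dt;
# which.max(AGE) picks the position of the largest AGE inside the group.
idx <- dt[, .I[which.max(AGE)], by = RANDID]
print(idx$V1)  # 2 5 7 9

# Step 2: subset the full table by those row numbers (all columns kept).
oldest <- dt[idx$V1]

# Base R alternative: compute the max AGE per RANDID, then merge it back
# to recover the remaining columns (assumes no ties in AGE within a patient).
df2 <- as.data.frame(dt)
oldest_age <- aggregate(AGE ~ RANDID, data = df2, FUN = max)
oldest_base <- merge(df2, oldest_age, by = c("RANDID", "AGE"))
```

Both oldest and oldest_base contain one row per patient; the merged version lists the RANDID and AGE columns first because they are the merge keys.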
YH Jang