I have a longitudinal data set of 142,415 rows and 965 columns. For each ID in the data set, there are multiple rows, not necessarily the same number of rows for each ID.
I would like to get the last row (data is already sorted) for each ID and create a data frame of just those, keeping all the remaining 964 columns of data.
When I look at previous questions addressing this, a lot of the suggestions use aggregate(), and I can't use that (at least as far as I know) because I have too many columns.
I did try the following, but it bogged down my computer, so I'm wondering whether there's a faster way to do this than making a list and then forming a data frame from it:
data.list <- by(data.in, data.in$ID, tail, n = 1)
data.new <- do.call("rbind", as.list(data.list))
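
For comparison, here is a minimal sketch of one vectorized alternative that avoids building a list at all. It relies on the data already being sorted, as stated above, and uses base R's duplicated() with fromLast = TRUE to keep only the final row for each ID. The small data.in below is a made-up stand-in (the visit and score columns are hypothetical), not the real 965-column data, and I have not timed this on a data set of that size.

```r
# Toy stand-in for data.in; values and the visit/score columns are hypothetical
data.in <- data.frame(
  ID    = c(1, 1, 2, 3, 3, 3),
  visit = c(1, 2, 1, 1, 2, 3),
  score = c(10, 11, 20, 30, 31, 32)
)

# duplicated(x, fromLast = TRUE) is TRUE for every element that has a later
# duplicate, so negating it marks the last occurrence of each ID.
# Indexing rows with that logical vector keeps all columns untouched.
data.new <- data.in[!duplicated(data.in$ID, fromLast = TRUE), ]

print(data.new)
```

Because this is a single logical subset rather than one tail() call per ID followed by rbind, it should involve far less copying, though whether it is fast enough here is an assumption worth checking.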