I have timestamped data with an ID column (e.g. multiple observations over time for each subject), and I want to use all of the previous measurements for each subject to predict the last one in the dataset.
This is related to the question: How to select the first and last row within a grouping variable in a data frame?
Currently I'm working with the data.table package, which is very efficient at selecting the first or last row per group using the solution in the linked question.
When I try to select the first N_g - 1 rows (where N_g is the number of rows in the current group), the query takes a very long time. Does anybody know of an efficient way to do something like that? Here's what I'm currently using:
firstn_elements <- dt[, .SD[1:(.N-1)], by=subject_id]
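To make the question reproducible, here is a minimal sketch on toy data. The `time` and `value` columns and their contents are made up for illustration; only `subject_id` comes from my actual code. The second query, which gathers row indices with `.I` and subsets the table once, is one possible alternative that I have not benchmarked; it assumes that the per-group `.SD` subset is what makes the original slow.

```r
library(data.table)

# Toy data (hypothetical): 3 subjects with 4, 2 and 1 observations
dt <- data.table(
  subject_id = c(1, 1, 1, 1, 2, 2, 3),
  time       = c(1, 2, 3, 4, 1, 2, 1),
  value      = c(10, 11, 12, 13, 20, 21, 30)
)

# Current approach: subset .SD once per group
firstn_elements <- dt[, .SD[1:(.N - 1)], by = subject_id]

# Possible alternative: collect the row numbers of all but the
# last row in each group, then subset the table a single time
idx <- dt[, .I[-.N], by = subject_id]$V1
firstn_alt <- dt[idx]
```

Note that the two forms are not equivalent for groups with a single row: `1:(.N - 1)` becomes `1:0`, which still selects that row, whereas `.I[-.N]` drops it.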