I have a data.table with id, location and date.
library(data.table)
library(dplyr)
library(lubridate)
data<-data.table(
id<-c(rep("1001",8),rep("1002",9),rep("1003",7)),
location<-c(rep("A",4),rep("B",4),rep("A",5),rep("B",4),rep("A",3),rep("B",4)),
date<-c(seq(dmy('01-01-2014'),dmy('01-08-2014'), by = '1 month'),
seq(dmy('01-04-2014'),dmy('01-12-2014'), by = '1 month'),
seq(dmy('01-02-2015'),dmy('01-08-2015'),by = '1 month')))
names(data)<-c("id","location","date")
I want to group_by(id,location) and select say 3 nearest months before change in location and 3 nearest months after the change in location based on the values in months. There could be missing months.
My current solution is to sort and use head/tail for each group and rbind back together.
data<-data[order(id,date)]
data[location=="A",tail(.SD,3),by=id]
data[location=="B",head(.SD,3),by=id]
this will fail if there are multiple dates with same value, in which case I would like to bring back all. Can anyone suggest a better programmatic way to do this with with rleid() perhaps or other?