I have a very large set of claims data (called data2) with one row per enrollee and columns enrolid (enrollment ID) and jan16allwd, ..., dec16allwd (allowed claims for each month of 2016), plus some other fields that aren't relevant here. For each enrollee, I want to extract the slope coefficient from the regression of allowed claims on month (allowed ~ month). I've tried this:
allowed <- c(data2$jan16allwd, data2$feb16allwd, data2$mar16allwd, data2$apr16allwd,
data2$may16allwd, data2$jun16allwd, data2$jul16allwd, data2$aug16allwd,
data2$sept16allwd, data2$oct16allwd, data2$nov16allwd, data2$dec16allwd)
months <- (1:12)
betas.allwd <- unlist(lapply(split(data2, data2$enrolid), function(chunk) {
  return(coef(lm(allowed ~ months, data = chunk))[[2]])
}))
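but I keep getting an error about the lengths of the variables being different. I believe this is because allowed is one long vector built from the whole of data2, so it is not split up by enrolid along with the chunks, while months only has 12 values. How can I fix this and return a vector with one slope per enrollee?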
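
One possible way to get there, sketched under assumptions: because data2 has one row per enrollee, each row's 12 monthly columns can serve directly as the response, so no split is needed. The toy data2 below is a hypothetical stand-in (made-up enrolid values and amounts), and month_cols, allowed_mat and betas.fast are names introduced only for this illustration. It also assumes there are no missing values in the monthly columns.

```r
# Column names taken from the question (note "sept", not "sep")
month_cols <- c("jan16allwd", "feb16allwd", "mar16allwd", "apr16allwd",
                "may16allwd", "jun16allwd", "jul16allwd", "aug16allwd",
                "sept16allwd", "oct16allwd", "nov16allwd", "dec16allwd")
months <- 1:12

# Hypothetical toy data: three enrollees with known trends
# (slope 10, slope 0, slope -2) so the result is easy to check
vals <- rbind(10 * months, rep(50, 12), 100 - 2 * months)
colnames(vals) <- month_cols
data2 <- data.frame(enrolid = c(101, 102, 103), vals)

# One row per enrollee, one column per month
allowed_mat <- as.matrix(data2[, month_cols])

# Fit lm() once per row and keep the slope (second coefficient)
betas.allwd <- apply(allowed_mat, 1, function(y) coef(lm(y ~ months))[[2]])
names(betas.allwd) <- data2$enrolid

# Faster alternative for large data: closed-form least-squares slope,
# sum((m - mean(m)) * y) / sum((m - mean(m))^2), computed for all rows at once
m_centered <- months - mean(months)
betas.fast <- drop(allowed_mat %*% m_centered) / sum(m_centered^2)
names(betas.fast) <- data2$enrolid
```

Both vectors hold one slope per enrollee, named by enrolid. The closed-form version avoids calling lm() once per row, which may matter for a very large table, but it would return NA for any row that contains a missing month, so rows like that would need separate handling.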