
I have a very large set of claims data (called data2) with one row per enrollee and columns enrolid (enrollment ID), jan16allwd, ..., dec16allwd, plus some other fields that aren't relevant here. For each enrollee, I want to extract the slope coefficient from the regression (allowed claims ~ month). I've tried this:

allowed <- c(data2$jan16allwd, data2$feb16allwd, data2$mar16allwd, data2$apr16allwd, 
data2$may16allwd, data2$jun16allwd, data2$jul16allwd, data2$aug16allwd, 
data2$sept16allwd, data2$oct16allwd, data2$nov16allwd, data2$dec16allwd)

months <- (1:12)

betas.allwd <- unlist(lapply(split(data2,data2$enrolid),function(chunk)
{return(coef(lm(allowed~months, data=chunk))[[2]])}))

but I keep getting an error about variable lengths being different. I know it's because allowed is built from all of data2 at once, so it isn't split up by enrolid the way the chunks are. How can I fix this and return the vector I need?

michael_p
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Nov 26 '19 at 19:04
  • Obviously, the `length` of `months` is 12. You may need to `rep`licate it to make the lengths the same – akrun Nov 26 '19 at 19:05
  • So you want to run a regression of 12 observations for each enrolid? – Parfait Nov 26 '19 at 20:24
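
If the goal is indeed one 12-point regression per enrolid, as the last comment asks, one possible approach is to fit each row's twelve monthly values against 1:12 directly instead of building a single pooled `allowed` vector. The sketch below is not a tested answer for the real data: the toy `data2`, the helper names `month_cols`, `allowed_toy`, `allowed_mat`, `x_centered` and `betas_fast` are all made up for illustration, and it assumes one row per enrollee with no missing monthly values.

```r
# Column names as given in the question (note "sept16allwd").
month_cols <- c("jan16allwd", "feb16allwd", "mar16allwd", "apr16allwd",
                "may16allwd", "jun16allwd", "jul16allwd", "aug16allwd",
                "sept16allwd", "oct16allwd", "nov16allwd", "dec16allwd")
months <- 1:12

# Hypothetical toy data standing in for data2: three enrollees whose
# allowed amounts are exact lines with slopes 10, 0 and -2.
allowed_toy <- rbind(10 * months + 5, rep(7, 12), -2 * months + 100)
colnames(allowed_toy) <- month_cols
data2 <- data.frame(enrolid = c(101, 102, 103), allowed_toy)

# One lm() per row: y holds that enrollee's 12 monthly values, so y and
# months always have the same length.
allowed_mat <- as.matrix(data2[, month_cols])
betas.allwd <- apply(allowed_mat, 1, function(y) coef(lm(y ~ months))[[2]])
names(betas.allwd) <- data2$enrolid

# Same slopes without calling lm(), using the simple-regression formula
# slope = sum((x - mean(x)) * y) / sum((x - mean(x))^2).
# Unlike lm(), this returns NA for any row that contains an NA.
x_centered <- months - mean(months)
betas_fast <- as.vector(allowed_mat %*% x_centered) / sum(x_centered^2)
names(betas_fast) <- data2$enrolid
```

Because every enrollee shares the same `months` vector, the closed-form version reduces to a single matrix product, which may matter on a very large table where calling `lm()` once per row is slow.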

0 Answers