I'm testing for random intercepts in preparation for growth curve modeling.

To do this, I first created a wide subset and then converted it to a long data set.

When I fit `ModelM1 <- gls(ent_act~1, data=school_l)` on the long data set, I get an error message because of missing values. In my long subset these values appear as NaN.

After applying `temp<-na.omit(school_l$ent_act)`, I can fit ModelM1. But when I fit the second model, `ModelM2 <- lme(temp~1, random=~1|ID, data=school_l)`, I get an error message saying that my variables are of unequal lengths.

How can I deal with those missing values? Any ideas or recommendations?

OTStats
Sventon
  • Please read [this post](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610) and then edit your question to provide a simple, self-contained example of your problem. – Limey Jun 06 '20 at 15:25

1 Answer

You might have success making a temporary data frame from which you remove entire rows, indexing by the negation of the missing condition: !is.na(school_l$ent_act)

temp<-school_l[ !is.na(school_l$ent_act), ]

Then re-run the lme call. There should now be no mismatch of variable lengths.

ModelM2 <- lme(ent_act ~ 1, random = ~1|ID, data = temp)

Note that using school_l is going to be potentially confusing because it looks so much like school_1 when viewed in Times font.
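The row-filtering approach above can be sketched on simulated data. This is only an illustration: the data frame below is made up (20 hypothetical people with 4 time points each), and only the names `school_l`, `ent_act`, `ID`, `temp`, `ModelM1` and `ModelM2` come from the question.

```r
library(nlme)

set.seed(1)

# Hypothetical long-format data: 20 people, 4 time points each (an assumption,
# not the asker's real data).
n_id   <- 20
n_time <- 4
school_l <- data.frame(
  ID   = factor(rep(seq_len(n_id), each = n_time)),
  time = rep(seq_len(n_time), times = n_id)
)

# Outcome = grand mean + person-specific intercept + residual noise.
person_effect <- rnorm(n_id, sd = 1)
school_l$ent_act <- 3 + person_effect[as.integer(school_l$ID)] +
  rnorm(nrow(school_l), sd = 0.5)

# Inject a few missing values, stored as NaN as in the question.
school_l$ent_act[c(3, 17, 42)] <- NaN

# Drop whole rows, so ent_act and ID keep the same length and stay aligned.
# is.na() is TRUE for NaN as well as NA.
temp <- school_l[!is.na(school_l$ent_act), ]

# Fit both models on the filtered data frame, using the column name in the formula.
ModelM1 <- gls(ent_act ~ 1, data = temp)
ModelM2 <- lme(ent_act ~ 1, random = ~ 1 | ID, data = temp)

# Compare the model without and with a random intercept.
anova(ModelM1, ModelM2)
```

The key point is that the formula refers to the column `ent_act` and `data` points at the filtered data frame, rather than passing a separately shortened vector alongside the full data. As an alternative, both `gls()` and `lme()` accept an `na.action` argument, so `na.action = na.omit` on the unfiltered data should drop the same incomplete rows here.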

IRTFM
  • Thank you! I think that worked. However, I somehow get "wrong" results: models 1 and 2 give the same results, which cannot be right judging by the data. Also, the intercept predicted by model 1 is far from my data. Does anyone see a mistake in my code? – Sventon Jun 07 '20 at 13:47
  • Test for random intercept: `temp<-!is.na(school_l$ent_act)` `ModelM1 <- gls(temp~1, data=school_l)` `summary(ModelM1)` # average ent_act across all people and all time points: 0.6509222, p-value = 0. `logLik(ModelM1)` # fit statistic: the closer to 0 the better (the less error you are making in the prediction) --> -1510.926. `deviance1 <- logLik(ModelM1)*-2` # Deviance – Sventon Jun 07 '20 at 13:48
  • Random coefficient model: `ModelM2 <- lme(temp~1, random=~1|ID, data=school_l)` `summary(ModelM2)` `logLik(ModelM2)` # -1510.926 (same as for model 1). `deviance2 <- logLik(ModelM2)*-2` `deviance1 - deviance2` # difference in deviance: -3.159585e-09 (almost 0). `anova(ModelM1, ModelM2)` # p-value = 1, absolutely not significant. – Sventon Jun 07 '20 at 13:48
  • Do _not_ use comments to amend or modify questions. I think this might require a new question, since this one was about how to avoid unequal variable lengths, and now (with that issue apparently solved) you are having different problems with model fitting. – IRTFM Jun 07 '20 at 18:44
  • Okay, that makes sense. Sorry for that. Thank you! – Sventon Jun 08 '20 at 09:04