2

My goal is to run a quadratic function using several IVs across time within subject. I have come across some code and am a little confused. Below is a reproducible example of what I am trying to run. Following the code will be my questions.

set.seed(1234)
obs <- 1:200
IV1<- rnorm(length(obs), mean = 1, sd = mean(obs^2) / 4)
IV2<- rnorm(length(obs), mean = 1.5, sd = mean(obs^2) / 8)
IV3<- rnorm(length(obs), mean = .5, sd = mean(obs^3) / 4)
y <- (obs + obs^2 + obs^3) + rnorm(length(obs), mean = 0, sd = mean(obs^2) / 4)
my.data <- data.frame(obs,IV1,IV2,IV3,y,
                      DV = y/10000, 
                      time= c(1,2,3,4,5),
                      Subj= rep(letters[1:20], each =5),
                      group= rep(letters[1:2], each =5))
my.data$group.factor<-as.factor(my.data$group)
my.data$dx <- as.numeric(ifelse(my.data$group == "a", 1, 0))

Polylmer<- lmer(DV ~ poly(time*IV1, 2) +  poly(time*IV2, 2) + poly(time*IV3, 2) +  poly(time*dx, 2) +  (1|Subj), data = my.data)  

My questions are as follow:

  1. In non poly() lmer statement time*IV2 would give coefficients for time and IV2 interaction as well as the lower order coefficients time and IV2. Am I correct that using the poly() statement does not put the lower terms into the model?

  2. I have been taught that if you include the higher terms you should include the lower terms also. Is this still correct with the poly() function in r?

If so Would make sense to use either

Polylmer2<- lmer(DV ~ poly(time, 2)*poly(IV1, 2) +  poly(time, 2)*poly(IV2, 2) + poly(time, 2)*poly(IV3, 2) +  poly(time*dx, 2) +  (1|Subj), data = my.data)      

Polylmer3<- lmer(DV ~ time + IV1 + IV2 + IV3 + dx + poly(time, 2) + poly(IV1, 2) + poly(IV2, 2) + poly(IV3, 2) + poly(time*IV1, 2) +  poly(time*IV2, 2) + poly(time*IV3, 2) +  poly(time*dx, 2) +  (1|Subj), data = my.data) 

I would assume that the two above are equivalent, however, I am wrong as the second gives me an error:

Error: Dropping columns failed to produce full column rank design matrix

  1. What columns did I drop?

Thank you for helping. I am very new to r so I am trying my best to understand what these functions do rather than just follow a recipe.

  • I think it would be helpful if you would be clearer about what terms you want. When you write `poly(time * IV1, 2)`, I take as given that you are expecting terms for `time`, `time^2`, `IV1`, `IV1^2`, and `time:IV1`. Probably also `(time:IV1)^2`. Do you also expect `(time)^2:IV1` and `time:(IV1)^2`? – Gregor Thomas Nov 20 '19 at 00:26
  • @Gregor I didn't even think of ```(time)^2:IV1``` and ```(IV1)^2:time```. I don't think they would make sense. Perhaps I am thinking about this poorly. I have been taught to always include lower order terms but it perhaps it is a rule of thumb rather than a requirement. I want to model the effects of several variables across time on a DV and expect that the relationship is quadratic. I am not interested particularly in the main effects. – Severe_Dragomere Nov 20 '19 at 00:49
  • I agree they wouldn't make sense, but when I see your code with terms like `poly(time, 2)*poly(IV1, 2)`, it seems like they are implied. Also I see that I missed `(time)^2:(IV1)^2`... which I also wouldn't want to include, how about you? – Gregor Thomas Nov 20 '19 at 03:00
  • I *think* what you want is `lmer(DV ~ polym(time, IV1, degree = 2) + polym(time, IV2, degree = 2) + ...)`, but I'm only about 75% sure. 10% of my doubt is that I'm not sure exactly what you want. Another 10% of my doubt is that the `?polym` help page isn't super descriptive. The last 5% of my doubt is that *if* my other assumptions are correct, the output of `polym` might include a `time` and `time^2` variables under different names each time you use it, which may continue to cause full rank errors... – Gregor Thomas Nov 20 '19 at 03:17
  • Though when I test it on `mtcars` with `lm(mpg ~ polym(hp, wt, degree = 2) + polym(hp, qsec, degree = 2), data = mtcars)`, it just produces `NA` coefficients for the duplicates, which is fine. – Gregor Thomas Nov 20 '19 at 03:17

0 Answers0