1

I have a mixed-effects model that I am running in the package MCMCglmm in R. There are several possible predictors that I am interested in exploring. How can I carry out model simplification to exclude uninformative predictors from the model?

In other contexts, I have used stepwise backwards regression i.e. starting with a "full" model which contains all of the predictors that I am interested in, and using drop1() to find uninformative variables, followed by update() to remove them from the model.

Neither of these functions appear to work with MCMCglmm objects. What is my best alternative? I guess that one reason that drop1() doesn't work is that the MCMCglmm object does not have an AIC value. Could the DIC value be used in a similar way instead? I don't know much about DIC values, but in the example below they don't seem to be behaving like AIC would: after removing a variable from the model, the DIC is reduced by over 100.

For update(), clearly it's possible to specify the model again from scratch, but it involves copy/pasting quite a long function call, and gets even more complex with longer model formulae that include interactions. I wondered if there is a simpler alternative.

I'm also wondering if there is a reason that there has not been a method written for these functions to work with MCMCglmm objects - is what I am trying to do in some way invalid?

Reproducible example using partially invented data:

set.seed(1234)
library(MCMCglmm)

data(bird.families) 
n <- Ntip(bird.families)

# Create some dummy variables
d <- data.frame(taxon = bird.families$tip.label, 
                X1 = rnorm(n),
                X2 = rnorm(n),
                X3 = sample(c("A", "B", "C"), n, replace = T),
                X4 = sample(c("A", "B", "C"), n, replace = T))

# Simulate a phenotype composed of phylogenetic, fixed and residual effects
d$phenotype <- rbv(bird.families, 1, nodes="TIPS") +
               d$X1*0.7 + 
               ifelse(d$X3 == "B", 0.5, 0) + 
               ifelse(d$X3 == "C", 0.8, 0) +
               rnorm(n, 0, 1)  

# Inverse matrix of shared phyloegnetic history
Ainv <- inverseA(bird.families)$Ainv

# Set priors
prior <- list(R = list(V = 1, nu = 0.002), 
              G = list(G1 = list(V = 1, nu = 0.002)))

# Run model
model <- MCMCglmm(phenotype ~ X1 + X2 + X3 + X4, 
                  random = ~taxon, 
                  ginverse = list(taxon=Ainv),
                  data = d, 
                  prior = prior, 
                  verbose = FALSE)

summary(model)

# drop1(model, test = "Chisq") # doesn't work
# update(model, ~.- X2)        # doesn't work

model2 <- MCMCglmm(phenotype ~ X1 + X3 + X4, 
                  random = ~taxon, 
                  ginverse = list(taxon=Ainv),
                  data = d, 
                  prior = prior, 
                  verbose = FALSE)

summary(model2)

Full model output:

 Iterations = 3001:12991
 Thinning interval  = 10
 Sample size  = 1000 

 DIC: 145.2228 

 G-structure:  ~taxon

      post.mean l-95% CI u-95% CI eff.samp
taxon     1.808   0.2167    3.347    17.88

 R-structure:  ~units

      post.mean  l-95% CI u-95% CI eff.samp
units    0.6712 0.0003382    1.585    20.98

 Location effects: phenotype ~ X1 + X2 + X3 + X4 

             post.mean   l-95% CI   u-95% CI eff.samp  pMCMC    
(Intercept) -0.6005279 -1.5753606  0.2689091   1000.0  0.198    
X1           0.7506249  0.5220491  1.0033785    506.8 <0.001 ***
X2          -0.0339347 -0.2452923  0.2085228    900.7  0.712    
X3B          0.6276852  0.1238884  1.1727409   1000.0  0.022 *  
X3C          1.0533919  0.4244092  1.5095804   1000.0 <0.001 ***
X4B         -0.0002833 -0.5099753  0.5103896    478.3  0.978    
X4C          0.0723440 -0.4435520  0.6193011   1000.0  0.776    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Reduced model output:

 Iterations = 3001:12991
 Thinning interval  = 10
 Sample size  = 1000 

 DIC: 296.4755 

 G-structure:  ~taxon

      post.mean l-95% CI u-95% CI eff.samp
taxon     1.542  0.07175      3.1    26.33

 R-structure:  ~units

      post.mean  l-95% CI u-95% CI eff.samp
units    0.8155 0.0004526    1.661    21.89

 Location effects: phenotype ~ X1 + X3 + X4 

            post.mean  l-95% CI  u-95% CI eff.samp  pMCMC    
(Intercept) -0.606920 -1.380267  0.335225   1000.0  0.172    
X1           0.750929  0.540667  1.011041   1000.0 <0.001 ***
X3B          0.633123  0.149282  1.243790   1000.0  0.032 *  
X3C          1.047085  0.518664  1.570641   1000.0 <0.001 ***
X4B          0.007019 -0.528080  0.512161    509.2  0.998    
X4C          0.063093 -0.490103  0.590965   1000.0  0.818    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
user2390246
  • 257
  • 1
  • 8

0 Answers0