2

I would like to simulate quantities of interest from a model estimated with MCMCglmm more or less the way Zelig package does. In Zelig you can set the values you want for the independent values and software calculates the result for the outcome variable (expected value, probability, etc). An example:

# Creating a dataset:
set.seed(666)
df <- data.frame(y=rnorm(100,20,20),z=rnorm(100,50,70))

# Loading Zelig
library(Zelig)

# Model
m1.zelig <- zelig(y~z, data=df, model="ls")
summary(m1.zelig)

# Simulating z = 10
s1 <- setx(m1.zelig, z = 10)
simulation <- sim(m1.zelig, x = s1)
summary(simulation)

As we can see, if z = 10 y is approximately 17.

# Same model with MCMCglmm
library(MCMCglmm)
m1.mcmc <- MCMCglmm(y~z, data=df, family = "gaussian", verbose = FALSE)
summary(m1.mcmc)

Is there any way to simulate z = 10 with the posterior distribution estimated by MCMCglmm and get the expected value of y? Thank you very much!

Ben Bolker
  • 211,554
  • 25
  • 370
  • 453
danilofreire
  • 503
  • 1
  • 5
  • 18

1 Answers1

3

You can simulate, but not as easily as in Zelig. You have to know a little more about the structure of the model you're fitting and the way the parameters are stored in the MCMCglmm object.

Set up data and fit:

set.seed(666)
df <- data.frame(y=rnorm(100,20,20),z=rnorm(100,50,70))
library(MCMCglmm)
m1.mcmc <- MCMCglmm(y~z, data=df, family = "gaussian", verbose=FALSE)

The most common protocol in R for prediction and simulation is to set up a new data frame with the same structure as the original data:

predframe <- data.frame(z=10)

Construct the model matrix for the linear model:

X <- model.matrix(~z,data=predframe)

Now use the chains of coefficients stored in the Sol ("solution") component of the MCMCglmm object; for convenience, I've set this up as a matrix calculation.

predframe$y <- X %*% t(m1.mcmc$Sol)

If you want to simulate for more complicated models (linear or generalized linear mixed models) then you'll need to work a little harder (handle random effects and inverse-link functions appropriately) ...

Ben Bolker
  • 211,554
  • 25
  • 370
  • 453
  • Dear Ben, thank you very much for your quick reply. That's exactly what I wanted to do. Regarding simulation for more complex models, do you have any suggestion from where to start? Once again, thanks for the response. – danilofreire Jan 11 '14 at 02:38
  • 1
    I'd probably make sure I'd gone carefully through the `MCMCglmm` vignettes, which have a lot of information, and make sure you know the model structures: for example, you want something like `Z ~ model.matrix(~[random effects formula, with intercepts suppressed])`; then multiply `Z` by the `$Liab` component (latent effects). You need to set `pl=TRUE` when running the model to get it to save the latent effects. – Ben Bolker Jan 11 '14 at 02:51
  • your suggestion works great. However, when I tried to include more predictors and run a Poisson model the results look very different from those I got in `Zelig`. I included other predictors in the `predframe`, repeated the same model matrix you described above (plus the additional variables) and multiplied by the `$Sol' component of a (single level) Poisson model estimated with `MCMCglmm'. Finally, to obtain the quantity of interest, I typed `exp(mean(predframe$y))`. Is there anything else I might be missing? Thanks again! – danilofreire Jan 14 '14 at 08:01
  • the only thing that looks funny is taking the mean of the linear predictor (`predframe$y`) before exponentiating -- why not just `exp(predframe$y)` ? Otherwise, a reproducible example would be best. (Maybe you should ask a new question rather than extending this one -- I'm not sure.) – Ben Bolker Jan 14 '14 at 13:48