Predicted SE vary from ggeffects::ggpredict - what am I doing wrong? 'Object not found' for predict function

Question

I am running an analysis in r on the effect of canopy cover (OverheadCover, proportion bounded by 0,1) and the number of carcasses placed on the same location (CarcassNumber, factor with 2 levels) on the proportion of carrion eaten by birds (ProportionBirdsScavenging, proportion bounded by 0,1). I plotted this interaction by modelling the effect of OverheadCover on ProportionBirdsScavenging for the separate values of CarcassNumber, then plotting it in the same graph. Having done this, I saw discrepancies in the SE's calculated by plot_model(glmm_interaction, type = "int") and the SE's I calculated. Here started an investigation. This journey led me through the source codes of plot_model, plot_type_int, ggpredict and stopped with ggpredict_helper. I did unfortunately not find the calculation of SE's, but I found proof of the difference in SE's. I always was pretty confident that I calculated my SE's correctly, but now I'm not so sure anymore. Please see my code below.

#rm(list = ls())

library(glmmTMB)
library(dplyr)

data_both <- data.frame(ProportionBirdsScavenging = c(0.406192519926425, 0.871428571428571, 0.452995391705069, 0.484821428571429, 0.795866569978245, 0.985714285714286, 0.208571428571429, 0.573982970671712, 0.694285714285714, 0.930204081632653, 0.0483709273182957, 0.0142857142857143, 0.661904761904762, 0.985714285714286, 0.0142857142857143, 0.0142857142857143),
                        pointWeight = c(233, 17, 341, 128, 394, 46, 5, 302, 10, 35, 57, 39, 12, 229, 28, 116),
                        OverheadCover = c(0.671, 0.04, 0.46, 0.65, 0.02, 0, 0.8975, 0.585, 0.6795, 0.0418, 0.5995, 0.6545, 0.02, 0, 0.92, 0.585),
                        CarcassNumber = as.factor(c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2)),
                        Area = c("Hamert", "KempenBroek", "KempenBroek", "KempenBroek", "Markiezaat", "Markiezaat", "Meinweg", "Valkenhorst", "Hamert", "KempenBroek", "KempenBroek", "KempenBroek", "Markiezaat", "Markiezaat", "Meinweg", "Valkenhorst"))
data_both$pointWeight_scaled <- scales::rescale(data_both$pointWeight, to = c(0.0001,1)) # rescale weights

glmm_interaction <- glmmTMB(ProportionBirdsScavenging ~ OverheadCover * CarcassNumber + (1|Area), data = data_both, beta_family(link = "logit"), weights = pointWeight_scaled)
ggeffects::ggpredict(glmm_interaction, terms = c("OverheadCover", "CarcassNumber [1:2]")) # SE's calculated by r

# Calculate the SE's for CarcassNumber 1
df_first_carcasses <- filter(data_both, CarcassNumber == 1) # create df with only first carcasses
myglmm <- glmmTMB(ProportionBirdsScavenging ~ OverheadCover, data = df_first_carcasses, beta_family(link = "logit"), weights = pointWeight_scaled)
new.xglmm <- expand.grid(OverheadCover = seq(min(data_both$OverheadCover), max(data_both$OverheadCover), length.out = 1000)) %>%
  mutate(Area = "Hamert", pointWeight_scaled = 1) # pad new.xglmm with an arbitrary value for Area and pointWeight_scaled, then exclude them in predict, otherwise error -> https://stackoverflow.com/questions/54411851/mgcv-how-to-use-exclude-argument-in-predict-gam
new.yglmm <- data.frame(predict(myglmm, new.xglmm, type = "link", exclude = c("Area","pointWeight_scaled"), se.fit = TRUE)) %>% # exclude Area and pointWeight_scaled from the prediction
  mutate(ProportionBirdsScavenging = plogis(fit)) %>% # calculate the ProportionBirdsScavenging on response scale
  rename(SE.untransformed = se.fit, untransformed.predictions = fit)
addTheseglmm1 <- mutate(data.frame(new.xglmm, new.yglmm), # calculate the lwr and upr bounds using the untransformed predictions and SE, then transformed to probability
                        lwr = plogis(untransformed.predictions - SE.untransformed),
                        upr = plogis(untransformed.predictions + SE.untransformed))
addTheseglmm1[c(1,45,501,653,707,1000),c(1,6,5)] # compare my SE's to the ggpredict SE's

#  calculated by ggpredict   >   manually calculated
#    x | Predicted |   SE |  >      SE.untransformed
# -------------------------  >   -------------------
# 0.00 |      0.82 | 0.44 |  >                  0.43
# 0.04 |      0.80 | 0.41 |  >                  0.40
# 0.46 |      0.56 | 0.22 |  >                  0.21
# 0.60 |      0.47 | 0.26 |  >                  0.25
# 0.65 |      0.44 | 0.29 |  >                  0.28
# 0.92 |      0.28 | 0.47 |  >                  0.46
#  ( CarcassNumber = 1 )

# Same for CarcassNumber 2
df_second_carcasses <- filter(data_both, CarcassNumber == 2) # create df with only second carcasses
myglmm <- glmmTMB(ProportionBirdsScavenging ~ OverheadCover, data = df_second_carcasses, beta_family(link = "logit"), weights = pointWeight_scaled)
new.xglmm <- expand.grid(OverheadCover = seq(min(data_both$OverheadCover), max(data_both$OverheadCover), length.out = 1000)) %>%
  mutate(Area = "Hamert", pointWeight_scaled = 1) 
new.yglmm <- data.frame(predict(myglmm, new.xglmm, type = "link", exclude = c("Area","pointWeight_scaled"), se.fit = TRUE)) %>% 
  mutate(ProportionBirdsScavenging = plogis(fit)) %>%
  rename(SE.untransformed = se.fit, untransformed.predictions = fit)
addTheseglmm2 <- mutate(data.frame(new.xglmm, new.yglmm),
                        lwr = plogis(untransformed.predictions - SE.untransformed),
                        upr = plogis(untransformed.predictions + SE.untransformed))

addTheseglmm2[c(1,45,501,653,707,1000),c(1,6,5)]

#  calculated by ggpredict   >   manually calculated
#    x | Predicted |   SE |  >      SE.untransformed
# -------------------------  >   -------------------
# 0.00 |      0.96 | 1.01 |  >                  1.21
# 0.04 |      0.95 | 0.94 |  >                  1.10
# 0.46 |      0.16 | 0.86 |  >                  0.96
# 0.60 |      0.04 | 1.11 |  >                  1.32
# 0.65 |      0.02 | 1.22 |  >                  1.46
# 0.92 |      0.00 | 1.83 |  >                  2.31
#  ( CarcassNumber = 2 )

The differences are most visible for CarcassNumber 2. Is my calculation incorrect? I suspect that this discrepancy might have to do with the including and excluding of pointWeight_scaled in the predict function. If I don't do this, it returns me an error saying Error in eval(extras, data, env) : object 'pointWeight_scaled' not found. Is this a common problem? I read here that including and then excluding it resolves the issue mgcv: How to use 'exclude' argument in predict.gam?. Is this not the proper way?

I hope that someone can shed some light on this problem.

At first glace it seems like `ggeffects::ggpredict` is using your full model, but you are calculating SE's on a reduced model. — Axeman, Feb 05 '20 at 16:27
Details are in `ggeffects:::get_predictions_glmmTMB`, though it may take some effort to parse out what is happening. — Axeman, Feb 05 '20 at 16:40
@ Axeman yes I am calculating SE's on a reduced model. I don't see how to otherwise produce these models and SE's separate for each value of ```CarcassNumber```. Basically, I'm trying to reproduce ```sjPlot::plot_model(glmm_interaction, type = "int", ci.lvl = 0.682)```, but then with ```ggplot```. I'll have a look at ```ggeffects:::get_predictions_glmmTMB``` — Peter, Feb 05 '20 at 18:45

score 1 · Answer 1 · answered Feb 05 '20 at 22:50

hm, when I run your example, I get the same results as you expect:

library(glmmTMB)
library(ggeffects)

data_both <- data.frame(ProportionBirdsScavenging = c(0.406192519926425, 0.871428571428571, 0.452995391705069, 0.484821428571429, 0.795866569978245, 0.985714285714286, 0.208571428571429, 0.573982970671712, 0.694285714285714, 0.930204081632653, 0.0483709273182957, 0.0142857142857143, 0.661904761904762, 0.985714285714286, 0.0142857142857143, 0.0142857142857143),
                        pointWeight = c(233, 17, 341, 128, 394, 46, 5, 302, 10, 35, 57, 39, 12, 229, 28, 116),
                        OverheadCover = c(0.671, 0.04, 0.46, 0.65, 0.02, 0, 0.8975, 0.585, 0.6795, 0.0418, 0.5995, 0.6545, 0.02, 0, 0.92, 0.585),
                        CarcassNumber = as.factor(c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2)),
                        Area = c("Hamert", "KempenBroek", "KempenBroek", "KempenBroek", "Markiezaat", "Markiezaat", "Meinweg", "Valkenhorst", "Hamert", "KempenBroek", "KempenBroek", "KempenBroek", "Markiezaat", "Markiezaat", "Meinweg", "Valkenhorst"))
data_both$pointWeight_scaled <- scales::rescale(data_both$pointWeight, to = c(0.0001,1)) # rescale weights

df_second_carcasses <- dplyr::filter(data_both, CarcassNumber == 2) # create df with only second carcasses
m <- glmmTMB(ProportionBirdsScavenging ~ OverheadCover, data = df_second_carcasses, beta_family(link = "logit"), weights = pointWeight_scaled)

ggpredict(m, "OverheadCover")
#> 
#> # Predicted values of ProportionBirdsScavenging
#> # x = OverheadCover
#> 
#>    x | Predicted |   SE |       95% CI
#> --------------------------------------
#> 0.00 |      0.96 | 1.21 | [0.70, 1.00]
#> 0.02 |      0.95 | 1.16 | [0.67, 0.99]
#> 0.04 |      0.94 | 1.10 | [0.65, 0.99]
#> 0.58 |      0.05 | 1.27 | [0.00, 0.40]
#> 0.60 |      0.04 | 1.31 | [0.00, 0.38]
#> 0.65 |      0.03 | 1.47 | [0.00, 0.32]
#> 0.68 |      0.02 | 1.55 | [0.00, 0.30]
#> 0.92 |      0.00 | 2.31 | [0.00, 0.13]
#> Standard errors are on link-scale (untransformed).

^{Created on 2020-02-05 by the reprex package (v0.3.0)}

A reason might be that glmmTMB was just updated on CRAN and has the re.form-argument enabled in predict(). I have included this in the very latest commit on GitHub (i.e. now predict() in ggeffects for glmmTMB models also use the re.form argument), so maybe you get identical results when you update ggeffects from GitHub?

Hi Daniel. Thanks for sharing your thoughts. I tried ```library(devtools)``` ```install_github("strengejacke/ggeffects")```, but the package installation failed. As I'm new with updating packages from GitHub, can you explain how can I do this correctly? — Peter, Feb 06 '20 at 06:22
If you provide some more details about the errors you get, I can probably help. Else, I've uploaded the source and binary version to GitHub, which you can install directly from R/RStudio: https://github.com/strengejacke/ggeffects/releases/tag/0.14.1.1 — Daniel, Feb 06 '20 at 08:27
Note that you need to update glmmTMB to version 1.0 from CRAN — Daniel, Feb 06 '20 at 08:28

Predicted SE vary from ggeffects::ggpredict - what am I doing wrong? 'Object not found' for predict function

1 Answers1