What does the plot_model() function use to determine the values to display in a graph?

Question

library(dplyr)
library(ggplot2)
library(lme4)
library(readxl)
library(AICcmodavg)
library(MuMIn)
library(afex)
library(sjPlot)
library(ggeffects)

I ran a GLMM model for nest use of a bird using the following code:

mod.top <- glmer(use ~ boxes.per.post + box.age + lsvo +
                      (1|year),
                    family = "binomial", 
                    data = WODU)

I then created a graph using the following code:

plot_model(mod.top, type = "pred", terms = c("box.age [all] ","boxes.per.post", "lsvo"), pred.type = "fe", 
            legend.title = "", axis.title = "Probability of Use", title = "")

This creates a great graph. The problem is, each pane has a seemingly random lsvo (late season visual obstruction) value. One even gives me a negative number, but the values in my spreadsheet range from 0-100. How does the plot_model function determine the values to show on each of these panes and can they be changed?

I tried searching the documentation for these functions but was unable to find anything.

Ben Bolker · Answer 1 · 2022-11-06T23:52:18.547

See e.g. the examples in vignette("plot_marginal_effects", package = "sjPlot"), in the "Marginal effects at specific values or levels" section:

The terms-argument not only defines the model terms of interest, but each model term that defines the grouping structure can be limited to certain values. This allows to compute and plot marginal effects for terms at specific values only. To define these values, put them in square brackets directly after the term name: terms = c("c12hour [30, 50, 80]", "c172code [1,3]")

plot_model(fit, type = "pred", terms = c("c12hour [30, 50, 80]", "c172code [1,3]"))

So I think something like

terms = c("box.age [all] ","boxes.per.post", "lsvo [1, 10, 21]")

would give you facets at clean/pretty values of lsvo.

What about the default settings? Following breadcrumbs in the documentation, ?ggpredict says

If a numeric vector is specified as second or third term (i.e. if this vector represents a grouping structure), representative values (see ‘values_at()’) are chosen (unless other values are specified).

... and ?values_at says that the default option is meansd, which

uses the mean value of the moderator as well as one standard deviation below and above mean value to plot the effect of the moderator on the independent variable.

Since [all] is one of the options for specifying values in terms, it's possible that you could pass other arguments for values_at through in this way (e.g. [minmax], [zeromax], [fivenum], etc.), but I haven't tried it.
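To illustrate why a negative facet value can show up even though lsvo only ranges from 0 to 100, here is a minimal base-R sketch of the meansd rule described above. The lsvo vector is made up for illustration (strongly right-skewed, bounded in 0-100); it is not the asker's data.

```r
# Hypothetical right-skewed values in [0, 100]; NOT the original WODU data
lsvo <- c(0, 0, 0, 0, 0, 0, 0, 0, 5, 100)

m <- mean(lsvo)  # mean of the moderator
s <- sd(lsvo)    # sample standard deviation

# The "meansd" rule: one SD below the mean, the mean, one SD above the mean
meansd_vals <- c(m - s, m, m + s)
meansd_vals

# The lowest value falls below the observed minimum of 0
meansd_vals[1] < min(lsvo)
```

With skewed data like this the standard deviation is larger than the mean, so mean minus one SD is negative. I'd expect ggeffects::values_at(lsvo) to return similar numbers, but I haven't checked how it rounds them.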

Other customizations requested (I'm guessing because you haven't specified a reproducible example):

adjust y-axis settings: + scale_y_continuous(labels = scales::percent, limits = c(NA,1)) (see this question)
capitalization in legend: not sure offhand. You could change the factor levels upstream ... or scale_colour_discrete(labels = ...) (but this might override some custom scale that plot_model has set up ...)

Honestly, once I have to do more than a little bit of customization on an sjPlot I usually fall back to using ggpredict to generate the prediction data/confidence intervals for me, then setting up the ggplot from scratch myself ...
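A rough sketch of that fallback, under stated assumptions: the data are simulated stand-ins for WODU, the factor levels "single" and "duplex" are guessed from the comment about the legend, and the lsvo values 10, 50 and 90 are arbitrary. The column names used in the plot (x, predicted, conf.low, conf.high, group, facet) are what ggpredict returns when three terms are given.

```r
library(lme4)
library(ggeffects)
library(ggplot2)

# Simulated stand-in for the WODU data (hypothetical values, for illustration only)
set.seed(1)
n <- 400
WODU <- data.frame(
  box.age        = sample(1:10, n, replace = TRUE),
  boxes.per.post = factor(sample(c("single", "duplex"), n, replace = TRUE)),
  lsvo           = runif(n, 0, 100),
  year           = factor(sample(2015:2020, n, replace = TRUE))
)
year_eff <- rnorm(nlevels(WODU$year), 0, 0.5)  # simulated year-level random intercepts
WODU$use <- rbinom(n, 1, plogis(-0.5 + 0.1 * WODU$box.age - 0.01 * WODU$lsvo +
                                  year_eff[as.integer(WODU$year)]))

mod.top <- glmer(use ~ boxes.per.post + box.age + lsvo + (1 | year),
                 family = binomial, data = WODU)

# Predictions (fixed effects only, the ggpredict default) at chosen lsvo values
pred <- ggpredict(mod.top,
                  terms = c("box.age [all]", "boxes.per.post", "lsvo [10, 50, 90]"))
pred_df <- as.data.frame(pred)

# Build the plot by hand: x = first term, group = second term, facet = third term
legend_labels <- c(duplex = "Duplex", single = "Single")
p <- ggplot(pred_df, aes(x = x, y = predicted, colour = group, fill = group)) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = 0.2, colour = NA) +
  geom_line() +
  facet_wrap(~ facet, labeller = labeller(facet = function(v) paste("lsvo =", v))) +
  scale_y_continuous(labels = scales::percent, limits = c(0, 1)) +
  scale_colour_discrete(labels = legend_labels) +
  scale_fill_discrete(labels = legend_labels) +
  labs(x = "Box age", y = "Probability of Use", colour = NULL, fill = NULL)
p
```

Because the scales are set up from scratch here, the percent axis and the capitalized legend labels don't compete with anything plot_model would have added.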

Awesome, thank you Ben! That worked, and now I'm realizing I'm not happy with the appearance of the graph. I want the y scale to display "100%" at the top. Using scale_y_continuous(limits = seq(0,100,25)) takes away the % symbol and still doesn't print 100 at the top. I also want to capitalize Duplex and Single in the legend. The sjPlot manual shows that some functions like plot_gpt have a legend.labels argument, but it doesn't look like plot_model does. — blake.struthers, Nov 06 '22 at 23:36
