
I am fitting the following regression: model <- glm(DV ~ conditions + predictor + conditions*predictor, family = binomial(link = "probit"), data = d).

I use 'sjPlot' (and 'ggplot2') to make the following plot:

library("ggplot2")
library("sjPlot")
plot_model(model, type = "pred", terms = c("predictor", "conditions")) +
  xlab("Xlab") +
  ylab("Ylab") +
  theme_minimal() +
  ggtitle("Title")

pred plot

But I can't figure out how to add a layer showing the distribution on the conditioning variable like I can easily do by setting "hist = TRUE" using 'interplot':

library("interplot")
interplot(model, var1 = "conditions", var2 = "predictor", hist = TRUE) +
      xlab("Xlab") +
      ylab("Ylab") +
      theme_minimal() +
      ggtitle("Title") 

plot with distribution on predictor variable

I have also tried a bunch of layers using just ggplot, with no success:

ggplot(d, aes(x=predictor, y=DV, color=conditions))+
  geom_smooth(method = "glm") +
  xlab("Xlab") +
  ylab("Ylab") +
  theme_minimal() +
  ggtitle("Title")

ggplot.

I am open to any suggestions!

Ingrid
  • I guess this will be helpful, if you are willing to have the frequencies outside the plot https://stackoverflow.com/questions/8545035/scatterplot-with-marginal-histograms-in-ggplot2 – z-cool Jul 10 '20 at 12:04

1 Answer


I've obviously had to try to recreate your data to get this to work, so it won't be faithful to your original, but if we assume your plot is something like this:

p <- plot_model(model, type = "pred", terms = c("predictor [all]", "conditions")) +
  xlab("Xlab") +
  ylab("Ylab") +
  theme_minimal() +
  ggtitle("Title")

p


Then we can add a histogram of the predictor variable like this:

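# Dividing the count by 1000 shrinks the bars to fit the 0-1 probability axis
# (the recreated data has 1000 rows, so each bar is the share of observations)
p + geom_histogram(data = d, inherit.aes = FALSE, 
                   aes(x = predictor, y = ..count../1000),
                   fill = "gray85", colour = "gray50", alpha = 0.3)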
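
The divisor of 1000 suits the recreated data, which has 1000 rows; with a different sample size the bars would need a different divisor to sit on the 0 to 1 axis. A minimal base-R sketch of the same arithmetic, using a made-up stand-in for d$predictor and a hypothetical helper bin_share() that is not part of ggplot2 or sjPlot:

```r
# Hypothetical helper (not from ggplot2 or sjPlot):
# share of observations falling in each histogram bin.
bin_share <- function(x) {
  counts <- hist(x, plot = FALSE)$counts  # bin counts only, no plot drawn
  counts / length(x)                      # divide by n instead of a fixed 1000
}

set.seed(1)
x <- rgamma(1000, 2.5, 3)  # made-up stand-in for d$predictor

shares <- bin_share(x)
sum(shares)  # the shares add up to 1
max(shares)  # the tallest bar stays below 1, so it fits a probability axis
```

In the ggplot call, replacing the hard-coded 1000 with nrow(d) should give the same scaling without depending on the sample size.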


If you want to do the whole thing in ggplot, you need to tell geom_smooth that your glm is a probit model by passing the family through method.args; otherwise it fits an ordinary linear regression, because glm defaults to a gaussian family. I've copied the color palette over too for this example, though note that the smoothing lines for the groups start at their lowest x value rather than extrapolating back to 0.

ggplot(d, aes(x = predictor, y = DV, color = conditions))+
  geom_smooth(method = "glm", aes(fill = conditions),
              method.args = list(family = binomial(link = "probit")),
              alpha = 0.15, size = 0.5) +
  xlab("Xlab") +
  scale_fill_manual(values = c("#e41a1c", "#377eb8")) +
  scale_colour_manual(values = c("#e41a1c", "#377eb8")) +
  ylab("Ylab") +
  theme_minimal() +
  ggtitle("Title") + 
  geom_histogram(aes(y = ..count../1000),
                 fill = "gray85", colour = "gray50", alpha = 0.3)
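
To illustrate why the family argument matters, here is a small base-R sketch on simulated data (the data and the names x, y, linear_fit and probit_fit are made up for this illustration and are not the asker's). It compares glm with its default gaussian family against the probit fit:

```r
set.seed(42)
x <- runif(500, -3, 3)
y <- rbinom(500, 1, pnorm(1.5 * x))  # binary outcome generated from a probit model

linear_fit <- glm(y ~ x)                                      # default family = gaussian
probit_fit <- glm(y ~ x, family = binomial(link = "probit"))  # what method.args requests

new_x <- data.frame(x = c(-3, 0, 3))
linear_pred <- predict(linear_fit, newdata = new_x, type = "response")
probit_pred <- predict(probit_fit, newdata = new_x, type = "response")

linear_pred  # a straight line in x, so it is not bounded by 0 and 1
probit_pred  # passed through pnorm, so it always lies between 0 and 1
```

The gaussian fit is what geom_smooth(method = "glm") draws when no family is supplied, which is why the curve in the question looks like a straight line.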



Data

set.seed(69)

n_each     <- 500
predictor  <- rgamma(2 * n_each, 2.5, 3)
predictor  <- 1 - predictor/max(predictor)
log_odds   <- c((1 - predictor[1:n_each]) * 5 - 3.605, 
              predictor[n_each + 1:n_each] * 0 + 0.57)
DV         <- rbinom(2 * n_each, 1, exp(log_odds)/(1 + exp(log_odds)))
conditions <- factor(rep(c("  ", " "), each = n_each))
d          <- data.frame(DV, predictor, conditions)
Allan Cameron
    THANK YOU SO MUCH. This is perfect! And really really impressive recreation of my data! Thanks again, you just saved my weekend. – Ingrid Jul 10 '20 at 13:42