5

I want to plot the predicted probabilities for a multinomial model in R, fitted with the nnet::multinom() function. I have numerical predictors on the log scale.

Even though {ggeffects} should be compatible with multinom(), the plot does not display confidence intervals the same way it does for linear models.

I am new to using to R and to this community, so my apologies if this question is very basic or is missing something essential. Here is a small example:

library(tidyverse)
library(nnet)
library(effects)
library(ggeffects)


df <- data.frame(response = c("1 Better", "1 Better", "1 Better", "2 Medium", "2 Medium", "2 Medium", "3 Worse", "3 Worse", "3 Worse"),
                 count = c(1000, 2000, 4000, 6000, 10000, 3000, 6000, 5000, 11000))

mod1 <- multinom(response ~ log(count), data = df)
summary(mod1)

effects::effect(mod1, term="log(count)", se=TRUE, confidence.level=.95) %>% plot() # Produces CIs.

ggeffects::ggpredict(mod1, terms = "count") %>% plot() + theme_bw() # No confidence intervals.

If others are looking for alternatives to {ggeffects}, I tried these approaches while looking for a solution:

Using effects::effect(): works, confidence intervals are included but the appearance isn't so customisable.

Combining {ggeffects} and {effects}: See this post on R Studio Community in which confidence intervals from the effects package were combined with ggeffects to create a plot. I got the error

Error in FUN(X[[i]], ...) : object 'L' not found

But it worked for that person.

Using {MNLpred} package and its mnl_pred_ova() : didn't work for me, I think because my predictors are in log scale. I got the following error:

Error in eval(parse(text = paste0("data$", xvari))) : attempt to apply non-function

Using mnlAveEffPlot() function from {DAMisc}: Worked but the plots aren't as customisable as I would like.

melex
  • 53
  • 4
  • Consider revising and streamlining your question a little so it is specifically about _one_ problem with one specific command. As of now, you seem to have tried three or more different approaches, all of which produced error messages or you had problems with but you only provide your code for the `ggeffects::ggpredict()` command. If your question is solely about that, consider editing the other things out. – Dr. Fabian Habersack Nov 11 '20 at 10:17

2 Answers2

3

You can do this using ggeffects::ggemmeans().

library(tidyverse)
library(ggthemes)
library(nnet)
library(ggeffects) # package version used: v0.16.0

df <- data.frame(response = c("1 Better", "1 Better", "1 Better", "2 Medium", "2 Medium", "2 Medium", "3 Worse", "3 Worse", "3 Worse"),
                 count = c(1000, 2000, 4000, 6000, 10000, 3000, 6000, 5000, 11000))

mod1 <- multinom(response ~ log(count),
                 data = df)

ggemmeans(mod1, terms = "count") %>% plot() + ggthemes::theme_tufte()

For more information on how to use {ggeffects}, you may also want to take a look at the package documentation and especially at the differences between the ggemmeans() and ggpredict() etc. (e.g. here).

The {ggeffects} package draws on the output created by {effects} but, and I believe this is what you are looking for, makes it much easier to customize the plot using standard ggplot commands.

Dr. Fabian Habersack
  • 1,111
  • 12
  • 30
  • Thanks very much - I have edited my question, feel free to let me know if could be improved further. – melex Nov 13 '20 at 08:57
0

The MNLpred package cannot handle the log() inside the regression function but works when you compute the log-scale beforehand.

# Packages
library(tidyverse)
library(nnet)
library(MASS)
library(MNLpred)
library(scales)
library(ggeffects)
library(ggthemes)

df <- data.frame(response = c("1 Better", "1 Better", "1 Better", 
                              "2 Medium", "2 Medium", "2 Medium", 
                              "3 Worse", "3 Worse", "3 Worse"),
                 count = c(1000, 2000, 4000, 
                           6000, 10000, 3000, 
                           6000, 5000, 11000))

mod1 <- multinom(response ~ log(count), data = df)
summary(mod1)

# Log-scaled
df$count_log <- log(df$count)

# Regression
mod2 <- multinom(response ~ count_log,
                 data = df,
                 Hess = TRUE)

# The models are identical:
coef(mod1) == coef(mod2)

After this step, you can use the mnl_pred_ova or the mnl_fd2_ova function for predicted probabilities or first differences/predicted marginal effects.

# 10 steps for predictions
steps <- (max(df$count_log) - min(df$count_log))/9

pred1 <- mnl_pred_ova(mod2,
                      data = df,
                      by = steps,
                      x = "count_log")

x_breaks <- seq(from = min(df$count_log),
                to = max(df$count_log), 
                length.out = 5)

x_labels <- seq(from = min(df$count),
                to = max(df$count),
                length.out = 5)

pred1$plotdata %>%
  ggplot(aes(x = count_log,
             y = mean,
             ymin = lower,
             ymax = upper)) +
  facet_wrap(. ~ response) +
  geom_line() +
  geom_ribbon(alpha = 0.2) +
  scale_y_continuous(labels = percent_format()) +
  scale_x_continuous(breaks = x_breaks,
                     labels = x_labels) +
  theme_bw()

Plot with predicted probabilities for three categories

Or the predicted marginal effects:

pred_fd <- mnl_fd2_ova(model = mod2,
                       x = "count_log",
                       value1 = min(df$count_log),
                       value2 = max(df$count_log),
                       data = df)

pred_fd$plotdata_fd %>%
  ggplot(aes(x = categories,
             y = mean,
             ymin = lower,
             ymax = upper)) +
  geom_pointrange() +
  scale_y_continuous(labels = percent_format()) +
  labs(title = "Predicted effect of Count on responses",
       x = "Categories",
       y = "Predicted marginal effect") +
  theme_bw()

enter image description here

MNeumann
  • 11
  • 2