0

I would like to ask for some help with depicting the slopes generated by a lmer() model.

The data that I have is the mass volume of different rats across different days. Each rat has different time points where they took the measurement of that volume.

For rat 1 I have volume c(78,304,352,690,952,1250) at days c(89,110,117,124,131,138) that belong to country Chile

For rat 2 I have volume c(202,440,520,870,1380) at days c(75,89,96,103,110) that belong to country Chile.

For rat 3 I have volume c(186,370,620,850,1150) at days c(75,89,96,103,110) that belong to country Chile.

For rat 4 I have volume c(92,250,430,450,510,850,1000,1200) at days c(47,61,75,82,89,97,103,110) that belong to country England.

For rat 5 I have volume c(110,510,710,1200) at days c(47,61,75,82) that belong to country England.

For rat 6 I have volume c(115,380,480,540,560,850,1150,1350) at days c(47,61,75,82,89,97,103,110) that belong to country England.

The lmer model is:

 m1 <- lmer(lVolume ~ Country*Day + (1|Rat))

I managed to plot the curves of my model by using:

m1%>% 
  augment() %>% 
  clean_names() %>% 
  ggplot(data = .,
         mapping = aes(x = day,
                       y = exp(l_volume),
                       group = rat)) +
  geom_point(alpha = 0.5) +
  geom_line(alpha = 0.5) +
  geom_point(aes(y = exp(fitted)),
             color = "red") + 
  geom_line(aes(y = exp(fitted)),
            color = "red") + 
  expand_limits(x = 0 , y = 0)

This model gave me predictions for new data points based on the model m1 for each of the rats across country.

From this lmer() I have one slope across the whole measurements, this is:

enter image description here

And by exp(predicted): enter image description here

However, I would like to plot this in a different way. I would like to plot the slope generated by each of the levels of country that I have.

The red lines would be the exp(slopes) generating by Chile, and England, but also depict the exp(slope) of the whole model containing both levels.

So, initially I thought that creating three lmer() models:

 m1 <- lmer(lVolume ~ Country*Day + (1|Rat))
 m2 <- lmer(lVolume ~ Day + (1|Rat)) (Rats in Chile)
 m3 <- lmer(lVolume ~ Day + (1|Rat)) (Rats in England)

But I noticed that m2 and m3 are quite different models because they do not have the interaction from Country that is something that I would like to check. So, I don't know what to do here.

Update

I tried this and kind of worked:

Final.Fixed<-effect(c("Country*Day"), m1,
                     xlevels=list(Day=seq(0,168,14)))
 Final.Fixed<-as.data.frame(Final.Fixed)
 
 Final.Fixed.Plot <-ggplot(data = Final.Fixed, aes(x = Day, y =exp(fit), group=Country))+
   coord_cartesian(xlim=c(0,170),ylim = c(0,8000))+ 
   geom_line(aes(color=Country), size=2)+
   geom_ribbon(aes(ymin=exp(fit-se), ymax=exp(fit+se),fill=Country),alpha=.2)+
   xlab("Day")+
   ylab("Volume")+
   scale_color_manual(values=c("blue", "red"))+
   scale_fill_manual(values=c("blue", "red"))+
   theme_bw()+
   theme(text=element_text(face="bold", size=12),
         panel.grid.major = element_blank(), 
         panel.grid.minor = element_blank(), 
         panel.border = element_rect(fill = NA, colour = "NA"),
         axis.line = element_line(size = 1, colour = "grey80"),
         legend.title=element_blank(),
         legend.position = c(.2, .92))
 Final.Fixed.Plot

enter image description here

Is this ok ? I think that I am still cosnidering the m1 with the country*Day interaction. Correct me if I am worng, please! Also, I don't know how I can add the exp(fit) curve for the whole model and the raw data points in this plot.

Could I get some hint/help, please ?

Rosa Maria
  • 111
  • 5

1 Answers1

2

Clean summary on top

The first code chunk contains a cleaned up version that addresses all points of the question, using some input from the comments. I've left the original answer below which step by step builds to the final plot.


library(tidyverse)
library(lme4)
library(broom.mixed)
library(ggeffects)

m1 <- lme4::lmer(lVolume ~ Country*Day + (1|Rat), data = df_rats %>% 
                   dplyr::mutate(lVolume = log(Volume)))

# predictions for each country
syn_df <- tidyr::expand_grid(
  Day = 1:170,
  Country = c("Chile", "England")
) %>%
  dplyr::mutate(lVolume = predict(m1, ., re.form = ~0)) 

# marginal effects for variable "Day"
df_day_marginal <- ggeffect(model = m1, terms = "Day", type = "fe") %>%
  as.data.frame() %>%
  dplyr::rename(Day = x, lVolume = predicted) %>%
  dplyr::mutate(Country = "overall")

#combine prediction curves
df_preds <- bind_rows(syn_df, df_day_marginal)

# manually assemble formulas [units missing]
y0 <- round(fixef(m1)[["(Intercept)"]], 2)
beta_day <- round(fixef(m1)[["Day"]], 3)
beta_englday <- round(fixef(m1)[["CountryEngland:Day"]], 3)
beta_engl <- round(fixef(m1)[["CountryEngland"]], 2)

f_chile <- paste0("volume = exp(", y0, " + ", beta_day, " * days)")
f_england <- paste0("volume = exp(", y0 + beta_engl , " + ", beta_day + beta_englday, " * days)")

df_labels <- data.frame(
  x = c(50, 50),
  y = c(1300, 1400),
  form = c(f_chile, f_england),
  country = c("Chile", "England")
)


m1 %>% 
  broom.mixed::augment()%>% 
  ggplot(aes(x = Day, y = exp(lVolume), color = Country)) +
  geom_ribbon(data = df_preds, aes(ymin = exp(conf.low), ymax = exp(conf.high), color = NULL, fill = Country), alpha = 0.3) +
  geom_line(data = df_preds, size = 1.5) +
  geom_line(aes(group = Rat)) +
  geom_point() +
  coord_cartesian(ylim  = c(0, 1500), xlim = c(0, 150)) +
  geom_text(data = df_labels, aes(x = x, y = y, label = form, color = country)) +
  labs(x = "days", y = "volume")

enter image description here


original answer

I've tried to stay as close as possible to your initial code for the first part of the question.

The first chunk trains the model and makes population-level predictions for Chile and England over the specified days. (using the re.form = ~0 argument as explained e.g. here)

library(tidyverse)
library(lme4)
library(broom.mixed)

#helpful to specify in that `lVolume` is the log of the data you provid in the question
m1 <- lme4::lmer(lVolume ~ Country*Day + (1|Rat), data = df_rats %>% 
                   dplyr::mutate(lVolume = log(Volume)))

days <- seq(0,168,14)
syn_df <- tidyr::expand_grid(
  Day = 1:170,
  Country = c("Chile", "England")
)

syn_df <- syn_df %>%
  dplyr::mutate(l_volume = predict(m1, syn_df, re.form = ~0)) %>% 
  janitor::clean_names() 

This can then be added to your original plot with minor modifications:

m1 %>% 
  broom.mixed::augment() %>% 
  janitor::clean_names() %>% 
  ggplot(data = .,
         mapping = aes(x = day,
                       y = exp(l_volume),
                       color = country)) +
  geom_point(alpha = 0.7) +
  geom_line(aes(group = rat), alpha = 0.7) +
  expand_limits(x = 0 , y = 0) +
  geom_line(data = syn_df, alpha = 1, size = 1.5) +
  coord_cartesian(ylim  = c(NA, 1500), xlim = c(NA, 150))

enter image description here

Added

In addition, we can add marginal effect for days to the plot.

df_day_marginal <- ggeffect(model = m1, terms = "Day", type = "fe")

m1 %>% 
  broom.mixed::augment() %>% 
  janitor::clean_names() %>% 
  ggplot() +
  geom_ribbon(data = df_day_marginal, aes(x = x, ymin = exp(conf.low), ymax = exp(conf.high)), alpha = 0.3) +
  geom_line(data = syn_df, aes(x = day, y = exp(l_volume), color = country), size = 1.5) +
  geom_line(data = df_day_marginal, aes(x = x, y = exp(predicted)), size = 1.5) +
  geom_point(aes(x = day, y = exp(l_volume), color = country), alpha = 0.7) +
  geom_line(aes(x = day, y = exp(l_volume), color = country, group = rat), alpha = 0.7) +
  expand_limits(x = 0 , y = 0) +
  coord_cartesian(ylim  = c(NA, 1500), xlim = c(NA, 150)) +
  labs(x = "days", y = "volume")

enter image description here

pholzm
  • 1,719
  • 4
  • 11
  • Ohh that looks even better, thanks! but still the slope for the whole model cannot be applied right ? – Rosa Maria Feb 22 '22 at 18:45
  • Not sure what the proper "whole model" would be. I guess you could argue about using the average predictions of Chile & Englang (since it's just two countries, and 3 rats each), or alternatively fit a model that does not account for Country and plot that as comparison. And w.r.t plotting the equations - some combination of coeffient extraction, `glue`and `geom_text` or similar should be a viable option. – pholzm Feb 22 '22 at 18:58
  • Yes, ahmm plotting the m1 : df.plot = ggpredict(model = m1, terms = "Day", type = "fe") ggplot(data = df.plot, mapping = aes(x = x, y = exp(predicted), ymin = exp(conf.low), ymax = exp(conf.high))) + geom_ribbon(fill = "lightblue") + geom_line(size = 1) So here you have the conf interval from the whole model with its slope but from the entire model. How can I integrate this curve into the one you did ? – Rosa Maria Feb 22 '22 at 20:16
  • `ggpredict` produces the predictions for country = "Chile", while `ggeffect` likely gives you what you want [conditional vs marginal effects, see the documentation here](https://strengejacke.github.io/ggeffects/reference/ggpredict.html#difference-between-ggpredict-and-ggeffect-or-ggemmeans-) – pholzm Feb 22 '22 at 20:33
  • 1
    I like this; a minor improvement would be to add `Country = "overall"`, `bind_rows` to combine the data set, so that you could get all three levels (England/Chile/overall) to show up in the legend – Ben Bolker Feb 22 '22 at 21:45
  • 1
    Thanks - I'll clean this up a bit and will also add the formulas – pholzm Feb 22 '22 at 21:47
  • @Rachel I've now added the formulas as well – pholzm Feb 22 '22 at 22:17
  • 1
    the `equatiomatic` package might get the equations you want – Ben Bolker Feb 22 '22 at 22:18
  • Thank you very much @pholzm ! I tried to compute the median of the curves but it gave me crazy curves hehe this mixed-effect models are more complex than I thought. Quick theoretical question, is it possible to catch the lag time of the curve at the beginning ? Like how much time takes to start seeing the curve increasing from the first original raw data that I have from the model ? Thanks again! – Rosa Maria Feb 23 '22 at 10:43
  • I will try to use that package @BenBolker, Thanks! – Rosa Maria Feb 23 '22 at 10:43
  • 1
    @Rachel Short theoretical answer: probably yes, depending on what you want exactly. If you mean to plot results relative to the first observation (day 47) you could add `%>% mutate(delta_days = Day - 47)` to all dataframes used in the plot and map `x` to `delta_days` instead of `Day`. – pholzm Feb 23 '22 at 11:31
  • Sorry to bother again @pholzm, do you know how I can plot the conf interval of the marginal distribution too ? – Rosa Maria Mar 20 '22 at 15:04
  • 1
    if you are looking for confidence intervals for the Chile/England predictions, you could replace the `syn_df` block with `ggpredict(model = m1, terms = c("Day", "Country")) %>% as.data.frame() %>% dplyr::rename(Day = x, lVolume = predicted, Country = group) ` – pholzm Mar 21 '22 at 11:38