I have a plot in mind that I would like to create, but I don't know how to achieve it.

I have two data frames: one contains the mean value for each factor level, and the other contains the pairwise differences between these levels.

contrasts <- data.frame(Level1 = c("setosa", "setosa", "versicolor"),
                        Level2 = c("versicolor", "virginica", "virginica"),
                        Diff = c(0.65, 0.46, -0.20),
                        CI_low = c(0.53, 0.35, -0.32),
                        CI_high = c(0.75, 0.56, -0.09))

means <- data.frame(Species = c("setosa", "versicolor", "virginica"),
                    Mean = c(3.42, .77, 2.97))

My goal is to use the means as the starting point for a triangle that would "project" onto the level of the corresponding contrast, and whose height would be equal to the CI (CI_low and CI_high). It would look something like this (pardon my paint):

[Image: hand-drawn sketch of the intended plot]

Using the following, I easily added the initial points:

library(tidyverse)

means %>%
  ggplot() + 
  geom_point(aes(x = Species, y= Mean)) + 
  geom_ribbon(data=contrasts, aes(x=Level1, ymin=CI_low, ymax=CI_high))

But I am having trouble adding the triangles. Any ideas? Thanks a lot!

Edit

Thanks to Yuriy Barvinchenko, who provided the code to obtain this plot:

contrasts %>% 
  bind_cols(id=1:3) %>% 
  inner_join(means, by=c('Level1' = 'Species')) %>% 
  select(id, x=Level1, y=Mean) %>% 
  bind_rows( (contrasts %>% 
                bind_cols(id=1:3) %>% 
                select(id, x=Level2, y=CI_low)),
             (contrasts %>% 
                bind_cols(id=1:3) %>% 
                select(id, x=Level2, y=CI_high))) %>% 
  ggplot(aes(x = x, y= y, group=id)) + 
  geom_polygon()

However, based on the means, I would have expected the middle level (versicolor) to be the "lowest", whereas in that plot it is virginica which has the lowest value.

sertsedat
Dominique Makowski
  • If I understand correctly, you want to project your means of one factor to the confidence intervals of another factor, correct? I would probably go with polygons (`geom_polygon()`), where your x-factors would be `c("A","B","B")` and your y-values the equivalent of `c(Mean_A, CI_High_B, CI_Low_B)` for each pairwise comparison. – teunbrand Apr 25 '19 at 07:29

1 Answer

If I understand your question correctly, you need code like this:

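# Load the tidyverse first so that tibble(), the pipe and ggplot2 are available
library(tidyverse)

contrasts <- tibble(Level1 = c("setosa", "setosa", "versicolor"),
                    Level2 = c("versicolor", "virginica", "virginica"),
                    Diff = c(0.65, 0.46, -0.20),
                    CI_low = c(0.53, 0.35, -0.32),
                    CI_high = c(0.75, 0.56, -0.09))

means <- tibble(Species = c("setosa", "versicolor", "virginica"),
                Mean = c(3.42, .77, 2.97))

# Build one triangle per contrast (three vertices sharing the same id):
# the apex sits at the mean of Level1, the other two corners at
# CI_low and CI_high on Level2.
contrasts %>% 
  bind_cols(id=1:3) %>% 
  inner_join(means, by=c('Level1' = 'Species')) %>% 
  select(id, x=Level1, y=Mean) %>% 
  bind_rows( (contrasts %>% 
                bind_cols(id=1:3) %>% 
                select(id, x=Level2, y=CI_low)),
             (contrasts %>% 
                bind_cols(id=1:3) %>% 
                select(id, x=Level2, y=CI_high))) %>% 
  # group = id tells geom_polygon() which vertices belong to the same triangle
  ggplot(aes(x = x, y= y, group=id)) + 
  geom_polygon()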

Please note that I use tibble() instead of data.frame() to avoid factors, which makes joining these tables easier.
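If you would rather keep data.frame(), here is a minimal sketch of one alternative, assuming the only goal is to keep the text columns as character rather than factor. The name means_df is just an illustrative choice. Passing stringsAsFactors = FALSE explicitly works on older R versions, and it is already the default since R 4.0.0:

```r
# Same data as in the question, but built with base data.frame().
# stringsAsFactors = FALSE keeps Species as character instead of factor.
means_df <- data.frame(Species = c("setosa", "versicolor", "virginica"),
                       Mean = c(3.42, .77, 2.97),
                       stringsAsFactors = FALSE)

# Check the column type before joining on it
is.character(means_df$Species)
```

With character columns on both sides, the inner_join() above should not need any factor-to-character coercion.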

Yuriy Barvinchenko
  • Thanks a lot, that's very clever! However, I have a concern about getting the "right" mean values: the difference between versicolor and virginica should be positive (versicolor < virginica), but it is negative in the plot (where versicolor > virginica). How can I address that? – Dominique Makowski Apr 25 '19 at 08:09
  • Hi Dominique, sorry, I'm not sure I understand your idea, so I can't help with this question. You can play with your data, it may give you a clue. Good luck! – Yuriy Barvinchenko Apr 25 '19 at 08:17
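
One possible reading of the concern in the comments is that the code above plots CI_low and CI_high as absolute y values, whereas they are bounds on a difference. The following is only a sketch under a loud assumption that the thread does not confirm: that Diff is computed as Level1 minus Level2, so the far side of each triangle should sit at the Level1 mean minus the CI bounds. The names start_mean and vertices are illustrative, and the data are copied from the question as given.

```r
library(ggplot2)

contrasts <- data.frame(Level1 = c("setosa", "setosa", "versicolor"),
                        Level2 = c("versicolor", "virginica", "virginica"),
                        Diff = c(0.65, 0.46, -0.20),
                        CI_low = c(0.53, 0.35, -0.32),
                        CI_high = c(0.75, 0.56, -0.09),
                        stringsAsFactors = FALSE)

means <- data.frame(Species = c("setosa", "versicolor", "virginica"),
                    Mean = c(3.42, .77, 2.97),
                    stringsAsFactors = FALSE)

# Mean of the starting level (Level1) for each contrast
start_mean <- means$Mean[match(contrasts$Level1, means$Species)]

# ASSUMPTION: Diff = Level1 - Level2, so the CI bounds are treated as
# offsets subtracted from the Level1 mean, not as absolute y values.
id <- seq_len(nrow(contrasts))
vertices <- data.frame(
  id = rep(id, times = 3),
  x = c(contrasts$Level1, contrasts$Level2, contrasts$Level2),
  y = c(start_mean,
        start_mean - contrasts$CI_high,
        start_mean - contrasts$CI_low),
  stringsAsFactors = FALSE
)

# One triangle per contrast, with the means drawn on top as points
p <- ggplot(vertices, aes(x = x, y = y, group = id)) +
  geom_polygon(alpha = 0.4) +
  geom_point(data = means, aes(x = Species, y = Mean), inherit.aes = FALSE)
p
```

Under that assumption, a negative difference (versicolor minus virginica) gives a triangle that rises from versicolor to virginica. If the contrasts were computed the other way round, the two subtractions would become additions.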