Here is some dummy code:

library(ggplot2)
library(dplyr)

diamonds |>
  dplyr::filter(color %in% c("D", "E", "F"), cut %in% c("Ideal", "Fair"), clarity %in% c("SI2", "VS2", "IF")) |>
  ggplot(aes(x = clarity, y = carat, color = color, shape = cut)) +
  stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.05, position = position_dodge(0.7)) +
  stat_summary(fun = mean, geom = "point", size = 2, position = position_dodge(0.7))

I would like to connect the means with a line within each clarity category (i.e. connect the circle to the triangle, as shown in red in the picture):

[Image: the plot from the code above, with an example connecting line drawn in red]

If I add a line layer (e.g. geom_line), I get the message "geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?", which makes sense, since both points sit within a single clarity group. I tried group = interaction(), but that did not work either: I was only able to connect points across different clarity groups.

yuliaUU
  • This code is not reproducible. Would you mind correcting it, please? Also, although this is not an answer to your question, I think your visualisation might become quite overwhelming. A good way to show paired data (your connected pairs) is to use the two dimensions of your coordinate system: here carat, with Fair on the x axis and Ideal on the y axis. You could then show all values rather than just the mean and error bars, or, if that is too crowded, 2d density estimates etc. – tjebo May 30 '22 at 22:47
  • Can you tell me what is not reproducible in my code? – yuliaUU May 31 '22 at 14:37
  • Not reproducible because `pd` is not assigned – Andrea M May 31 '22 at 14:42
  • tjebo: you just have a conflict in the filter function; `dplyr::filter` will fix the issue – yuliaUU May 31 '22 at 15:04

1 Answer

I think it is best to use a manual dodge:

library(ggplot2)
library(dplyr)

df <- diamonds %>% dplyr::filter(color %in% c("D","E", "F"), cut %in% c("Ideal","Fair"), clarity %in% c("SI2","VS2","IF")) 

## make a named vector for your manual dodge
## this of course needs adjusting to your actual data; it can be automated
dodge_vec <- seq(-.25, .25, length.out = 6)
names(dodge_vec) <- unique(with(df, paste(cut, color, sep = "_")))

## some data alterations - assign the dodge by subsetting with the named vector
df <- df %>%
  mutate(cut_col = dodge_vec[paste(cut, color, sep = "_")])
## summarise for your lines 
df_line <- 
  df %>%
  group_by(clarity, cut, color, cut_col) %>%
  summarise(mean_carat = mean(carat))
#> `summarise()` has grouped output by 'clarity', 'cut', 'color'. You can override
#> using the `.groups` argument.

## pass your original x as an integer and add your new dodging column
ggplot(df, aes(x = as.integer(factor(clarity)) + cut_col, y = carat, color = color, shape = cut)) +
  stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.05) +
  stat_summary(fun = mean, geom = "point", size = 2) +
  ## add lines with your new data, using an interaction variable
  geom_line(data = df_line, aes(y = mean_carat, group = interaction(as.integer(clarity), color))) +
  ## label the breaks in factor-level order, which is how as.integer(factor()) numbers them
  scale_x_continuous(breaks = 1:3, labels = levels(droplevels(df$clarity)))
#> Warning: Using shapes for an ordinal variable is not advised
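
The comment in the code above notes that the dodge vector can be automated. One possible way is sketched below. It is an assumption about what "automated" could look like, not part of the original reprex, and the helper name `combos` is made up for illustration. It derives the number of offsets from the data and orders them by color and then cut, so that the two cuts of one color end up next to each other.

```r
library(ggplot2) # provides the diamonds data set
library(dplyr)

df <- diamonds %>%
  dplyr::filter(color %in% c("D", "E", "F"),
                cut %in% c("Ideal", "Fair"),
                clarity %in% c("SI2", "VS2", "IF")) %>%
  droplevels()

## one row per cut/color combination present in the data,
## sorted by color first so both cuts of a color are neighbours
combos <- df %>%
  distinct(color, cut) %>%
  arrange(color, cut) %>%
  mutate(key = paste(cut, color, sep = "_"))

## as many evenly spaced offsets as there are combinations
dodge_vec <- setNames(
  seq(-0.25, 0.25, length.out = nrow(combos)),
  combos$key
)

## same lookup as in the manual version
df <- df %>%
  mutate(cut_col = unname(dodge_vec[paste(cut, color, sep = "_")]))
```

The rest of the plotting code can stay the same, because `cut_col` is still a numeric offset added to the integer position of `clarity`.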

Your question suggests that you're dealing with paired data, hence my suggestion in the comment. I wanted to give an example, but the diamonds data set doesn't contain paired data, so it would be a bit difficult to fake that.
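
As a rough illustration of the scatter-plot idea only, group means can stand in for pairs. This is a sketch under that assumption: the means are not truly paired observations, and the name `paired_means` is made up here. It puts the mean carat of Fair diamonds on the x axis and of Ideal diamonds on the y axis, with one point per clarity and color combination.

```r
library(ggplot2)
library(dplyr)
library(tidyr)

paired_means <- diamonds %>%
  dplyr::filter(color %in% c("D", "E", "F"),
                cut %in% c("Ideal", "Fair"),
                clarity %in% c("SI2", "VS2", "IF")) %>%
  group_by(clarity, color, cut) %>%
  summarise(mean_carat = mean(carat), .groups = "drop") %>%
  ## one column per cut, so each row holds a Fair/Ideal "pair"
  pivot_wider(names_from = cut, values_from = mean_carat) %>%
  ## drop combinations where one of the two cuts is absent
  drop_na(Fair, Ideal)

p <- ggplot(paired_means, aes(x = Fair, y = Ideal, color = color)) +
  ## identity line: points below it have a larger mean carat for Fair
  geom_abline(slope = 1, intercept = 0, linetype = "dashed") +
  geom_point(size = 3) +
  facet_wrap(vars(clarity))

p
```

With real paired data, each point would be one pair instead of one group mean, and the distance from the dashed line would show the within-pair difference directly.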

Created on 2022-05-31 by the reprex package (v2.0.1)

tjebo
  • This will be useful for my pairwise comparison plots. – PesKchan Jun 07 '22 at 06:59
  • @PesKchan I generally don't really recommend that visualisation; it was more to show technical feasibility. Paired data is much more convincingly visualised with scatter plots. See for example my suggestions in https://stackoverflow.com/questions/70397418/create-a-split-violin-plot-with-paired-points-and-proper-orientation or https://stackoverflow.com/a/70323223/7941188 – tjebo Jun 07 '22 at 08:00
  • The first one is what I would incorporate, since my data is designed as pre- and post-treatment for patient groups – PesKchan Jun 07 '22 at 08:15