This is the plot I want:

[Image: plot I want]

This is my code so far (but it does not give me the results I want):

color_scale_class <- c("#00aedb", "#ff6f69", "#ffcc5c", "#cbe885", "#88d8b0")   
ggplot(fluct_all, aes(x=type, y=test_acc, color=as.factor(fold), fill=type)) +
  stat_boxplot(geom = "errorbar") +
  geom_boxplot(outlier.alpha = 0.01, outlier.size = 0.75) +
  scale_fill_manual(values = color_scale_class,
                    name = "Input Data\nCombination",
                    labels = c("TLS", "TLS & GEO", "TLS & RGB", "ALL")) +
  scale_color_manual(values=rep("black", 5)) +
  theme_light() +
  theme(text = element_text(size = 14, family="Calibri")) +
  scale_x_discrete(labels = c("TLS", "TLS & GEO", "TLS & RGB", "ALL")) +
  xlab("") +
  ylab("Test Accuracy\n") +
  guides(color = "none")

However, I have several issues. First, I want the whiskers to end in small horizontal lines. I tried stat_boxplot(geom = "errorbar"), but this only works when I keep the default width. I want a width of 0.2, and when I set it, the error bars no longer align horizontally with the boxplots.

My second issue: I want one boxplot for each "fold", but I want the colors to be based on the "type". I managed to get the separate boxplots by using fill=as.factor(fold), but then I cannot get the colors to refer to the "type".

I would like to avoid facets, since I want everything in one plot. In short, I want several boxplots split by one variable (which I achieve with fill), but I want the fill color to follow a different variable. Any help is highly appreciated!

I have some dummy data here:

fluct_all <- data.frame(
  "type" = rep(c("tls", "tls_rgb", "tls_geo", "tls_rgb_geo"), each=50),
  "fold" = rep(1:5, 10),
  "test_acc" = rnorm(200))

Edit: I just got an idea and fixed the color issue (sorry!), but my whiskers of stat_boxplot still do not align with geom_boxplot.

Zoe
  • The "dodge width" needs to be the same for both the boxes and the error bars. Set `position` in both to e.g. `pd = position_dodge(width = 0.7)`. For a more thorough explanation, see [What is the width argument in position_dodge?](https://stackoverflow.com/questions/34889766/what-is-the-width-argument-in-position-dodge) – Henrik Aug 04 '21 at 14:23
  • For the first (removed) question, you can use the `group` argument in `aes`: `ggplot(fluct_all, aes(x = type, y = test_acc, group = interaction(type, fold), fill = type)) + geom_boxplot()` – Henrik Aug 04 '21 at 14:31
  • Thank you! It works perfectly with `position_dodge()`; I did not know this function. I will also use the `group` approach, since it looks nicer in my code. Thank you very much! – Zoe Aug 04 '21 at 14:34
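
Putting the two comments together, a minimal sketch could look like the following. It reuses the dummy data from the question and the dodge width of 0.7 suggested in the comment. The `set.seed()` call, the boxplot width of 0.6, and the object names `pd` and `p` are assumptions made only for illustration. The Calibri font setting is left out because that font may not be installed everywhere.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the dummy data is reproducible

# Dummy data from the question (fold is recycled to 200 rows by data.frame)
fluct_all <- data.frame(
  type = rep(c("tls", "tls_rgb", "tls_geo", "tls_rgb_geo"), each = 50),
  fold = rep(1:5, 10),
  test_acc = rnorm(200)
)

color_scale_class <- c("#00aedb", "#ff6f69", "#ffcc5c", "#cbe885", "#88d8b0")

# One shared dodge object, so boxes and error bars get the same offsets
pd <- position_dodge(width = 0.7)

p <- ggplot(fluct_all,
            aes(x = type, y = test_acc,
                group = interaction(type, fold),  # one box per type/fold pair
                fill = type)) +                   # fill colour follows type
  stat_boxplot(geom = "errorbar", width = 0.2, position = pd) +
  geom_boxplot(width = 0.6,  # assumption: illustrative box width
               position = pd,
               outlier.alpha = 0.01, outlier.size = 0.75) +
  scale_fill_manual(values = color_scale_class,
                    name = "Input Data\nCombination",
                    labels = c("TLS", "TLS & GEO", "TLS & RGB", "ALL")) +
  scale_x_discrete(labels = c("TLS", "TLS & GEO", "TLS & RGB", "ALL")) +
  theme_light() +
  theme(text = element_text(size = 14)) +
  xlab("") +
  ylab("Test Accuracy\n")

p
```

Because both layers receive the same `position_dodge()` object, the centre of each error bar should coincide with the centre of its box even though the two layers use different `width` values. The `group = interaction(type, fold)` mapping also makes the `color = as.factor(fold)` workaround (and the matching `scale_color_manual()` and `guides()` calls) unnecessary.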