
Is there an easy way to add the mean plus the sd to a geom_point() plot, like this:

[image: example of the desired plot]

Going further, it would be nice to also take the levels of a factor into account. My data looks like this:

 str(df)
'data.frame':   138 obs. of  7 variables:
 $ Measurement_type: Factor w/ 3 levels "block_w_same_oil",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ BDV             : num  45.2 64 77 70.2 67.9 55.7 59.8 67.4 75.1 75.2 ...
 $ Temp            : Factor w/ 2 levels "cold","warm": 1 1 1 1 1 1 1 1 1 1 ...
 $ Temp_C          : num  20.1 20.1 20.1 20.1 20.1 20.1 20.1 20.5 20.5 20.5 ...
 $ Pollution       : Factor w/ 2 levels "clean","polluted": 1 1 1 1 1 1 1 1 1 1 ...
 $ Step            : num  1 2 3 4 5 6 1 2 3 4 ...
 $ Rep             : Factor w/ 5 levels "M1","M2","M3",..: 1 1 1 1 1 1 2 2 2 2 ...

I would like to be able to create such plots easily for, e.g., the factors Measurement_type and Rep, but maybe also for Pollution and Temp. Is there a built-in feature so that I don't have to calculate the means and sds and merge data frames myself?

What I have at the moment is:

library(dplyr)      # provides %>%
library(ggplot2)
library(tidyquant)  # provides theme_tq()

df %>%
  ggplot(aes(x = Step, y = BDV, colour = Measurement_type, shape = Rep), alpha = 0.8) +
  geom_point(aes(colour = Measurement_type), size = 3) +
  stat_summary(fun.data = 'mean_sdl', geom = 'smooth') +
  xlab("Step") + ylab("BDV / kV") +
  theme_tq()

which produces

[image: resulting plot]

which does the job in principle, but is not really usable because the visualization is not great (and the sd bands, as with geom_ribbon, are not even there yet).

Ben
  • In general: a `geom_line()` for the mean plus a `geom_ribbon()` for the confidence bands. For more help, I would suggest providing [a minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) including the code you have tried so far and a snippet of your data or some fake data. – stefan Feb 01 '23 at 20:37
  • But for this I have to calculate the mean and confidence bands on my own, right? – Ben Feb 01 '23 at 20:43

1 Answer


One option is to use two stat_summary() layers, one for the mean line and one for the band around it. Note that with fun.data = "mean_se" the band is the mean ± one standard error, not the sd. If you want lines and ribbons for the interaction of Rep and Measurement_type, drop the group aes.

Using some fake random example data:

library(ggplot2)

set.seed(123)

df <- data.frame(
  Measurement_type = sample(LETTERS[1:3], 100, replace = TRUE),
  Rep = sample(letters[1:5], 100, replace = TRUE),
  Step = sample(seq(5), 100, replace = TRUE),
  BDV = runif(100, 25, 75)
)

ggplot(df, aes(x = Step, y = BDV, colour = Measurement_type, shape = Rep)) +
  # Ribbon: mean +/- 1 standard error per Measurement_type
  stat_summary(aes(
    fill = Measurement_type,
    group = Measurement_type
  ), fun.data = "mean_se", geom = "ribbon", alpha = .3, color = NA) +
  # Line: mean per Measurement_type
  stat_summary(aes(group = Measurement_type), fun.data = "mean_se", geom = "line") +
  # alpha is ignored when passed to ggplot(), so set it on the point layer
  geom_point(size = 3, alpha = 0.8) +
  xlab("Step") +
  ylab("BDV / kV")
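
The question asked for the mean plus the sd, whereas `mean_se` gives the mean ± one standard error. Here is a minimal sketch of one way to get an sd band without extra packages. It assumes a small helper, hypothetically named `mean_sd` here (it is not part of ggplot2), that returns the `y`, `ymin` and `ymax` columns which `stat_summary()` expects from `fun.data`. The `fun = mean` argument assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

# Hypothetical helper (not part of ggplot2): mean +/- mult * sd, returned in
# the y / ymin / ymax layout that stat_summary(fun.data = ...) expects.
mean_sd <- function(x, mult = 1) {
  m <- mean(x)
  s <- sd(x)  # NA for groups with a single observation
  data.frame(y = m, ymin = m - mult * s, ymax = m + mult * s)
}

# Same fake data as in the answer
set.seed(123)
df <- data.frame(
  Measurement_type = sample(LETTERS[1:3], 100, replace = TRUE),
  Rep = sample(letters[1:5], 100, replace = TRUE),
  Step = sample(seq(5), 100, replace = TRUE),
  BDV = runif(100, 25, 75)
)

p_sd <- ggplot(df, aes(x = Step, y = BDV, colour = Measurement_type)) +
  # Ribbon: mean +/- 1 sd per Measurement_type and Step
  stat_summary(aes(fill = Measurement_type, group = Measurement_type),
    fun.data = mean_sd, geom = "ribbon", alpha = .3, colour = NA
  ) +
  # Line: mean per Measurement_type and Step
  stat_summary(aes(group = Measurement_type), fun = mean, geom = "line") +
  geom_point(aes(shape = Rep), size = 3, alpha = 0.8) +
  labs(x = "Step", y = "BDV / kV")

p_sd
```

`fun.data = "mean_sdl"` with `fun.args = list(mult = 1)` should give the same band, but it needs the Hmisc package installed, and its default is `mult = 2`.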

EDIT

ggplot(df, aes(x = Step, y = BDV, shape = Rep)) +
  # Ribbons: one per Measurement_type and one per Rep (mean +/- 1 standard error)
  stat_summary(aes(
    fill = Measurement_type,
    group = Measurement_type
  ), fun.data = "mean_se", geom = "ribbon", alpha = .3, color = NA) +
  stat_summary(aes(
    fill = Rep,
    group = Rep
  ), fun.data = "mean_se", geom = "ribbon", alpha = .3, color = NA) +
  # Mean lines: 3 for Measurement_type plus 5 for Rep, 8 colours in total
  stat_summary(aes(colour = Measurement_type, group = Measurement_type), fun.data = "mean_se", geom = "line") +
  stat_summary(aes(colour = Rep, group = Rep), fun.data = "mean_se", geom = "line") +
  # alpha is ignored when passed to ggplot(), so set it on the point layer
  geom_point(aes(colour = Measurement_type), size = 3, alpha = 0.8) +
  xlab("Step") +
  ylab("BDV / kV")


stefan
  • Thank you. Is it possible to come up with, in your example, eight lines in total? 3 for measurement_type and 5 for reps (in parallel)? So in total 8 lines of 8 different colors? – Ben Feb 02 '23 at 07:24
  • Almost everything is possible using ggplot2. :D See my edit. But I'm not sure whether that makes much sense. IMHO the plot gets cluttered, and we also map two different variables to the same aesthetic. – stefan Feb 02 '23 at 08:59
  • Yeah, that's true! Because of that, I realize that a facet_grid probably makes more sense. But thanks anyway! – Ben Feb 02 '23 at 09:20
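
For the facet_grid idea from the last comment, a minimal sketch could look like the following. The layout is an assumption (one panel per Measurement_type, with the mean and band computed across all reps), and it reuses the fake data from the answer rather than the real df.

```r
library(ggplot2)

# Same fake data as in the answer
set.seed(123)
df <- data.frame(
  Measurement_type = sample(LETTERS[1:3], 100, replace = TRUE),
  Rep = sample(letters[1:5], 100, replace = TRUE),
  Step = sample(seq(5), 100, replace = TRUE),
  BDV = runif(100, 25, 75)
)

p_facet <- ggplot(df, aes(x = Step, y = BDV)) +
  # Band: mean +/- 1 standard error per Step, computed within each panel
  stat_summary(fun.data = "mean_se", geom = "ribbon", alpha = .3) +
  # Line: mean per Step within each panel
  stat_summary(fun = mean, geom = "line") +
  # Raw points, with the rep shown by shape only
  geom_point(aes(shape = Rep), size = 3, alpha = 0.8) +
  # Assumed layout: one column of panels per Measurement_type
  facet_grid(. ~ Measurement_type) +
  labs(x = "Step", y = "BDV / kV")

p_facet
```

Using `facet_grid(Rep ~ Measurement_type)` instead would give one panel per combination, but that only makes sense if each Step has several observations per panel. With a single observation the standard error is NA and no band can be drawn.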