
I would like to display the number of observations per group in a graph that reports the value of a variable over time.

This is my current script. Thank you in advance!

ggplot(df, aes(x=time_nrp, y=lac, color=as.factor(outcome)))+ 
  geom_point(stat="summary", fun=mean, size=4)+
  geom_line(stat="summary", fun=mean, aes(group=as.factor(outcome)))+
  stat_summary(fun.data=mean_se, geom="errorbar", width=0.01)+
  xlab("NRP Time (hours)")+
  ylab("Lactate (mmol/l)")+
  geom_text(aes(y = 0,label = lac),vjust = 0)+
  theme_bw()+
  ggtitle("Panel A")+ 
  theme(plot.title = element_text(hjust=0.5))+
  theme(text=element_text(family="Helvetica", size=20))+
  scale_color_manual(name="Outcome", 
                     breaks=c("0", "1"),
                     labels=c("Negative", "Positive"),
                     values = c("#E12000", "#002F80"))

This is an example of my data. I would like to display the number of observations per group next to each time point on the two lines that represent the groups.

id  group time lactate     
 1     A   1   1.2
 1     A   2   1.1
 1     A   3   1.3
 2     B   1   0.8
 2     B   2   0.7
 2     B   3   0.9
 3     A   1   0.7
 3     A   2   0.9
 3     A   3   1.3
 4     B   1   0.5
 4     B   2   0.6
 4     B   3   0.7 



Irene
  • Welcome to SO! It would be easier to help you if you provided [a minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) including a snippet of your data or some fake data. It would also be useful to know where you want to "display the number of observations per group". – stefan Jul 13 '22 at 10:05
  • Yes, sorry. I have added an example of my data. – Irene Jul 19 '22 at 08:36

1 Answer


I'm still not sure about your desired result, but maybe this helps. Instead of using stat_summary, the pragmatic approach is often to do the calculations outside of ggplot, i.e. to use an aggregated data frame that holds the labels (here the counts) and their positions. Since you want the "number of observations per group next to the time point", I also added the mean of lactate to the data frame so that the number of observations can be placed next to the points.

Note: I opted for geom_label because by default it adds some padding around the label.

library(dplyr)
library(ggplot2)

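# Aggregate outside of ggplot: one row per time point and group,
# holding the label position (mean) and the label text (count)
df_labels <- df %>%
  group_by(time, group) %>%
  summarise(mean_lactate = mean(lactate), n = n(), .groups = "drop")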
  
ggplot(df, aes(x = time, y = lactate, color = as.factor(group))) +
  geom_point(stat = "summary", fun = mean, size = 4) +
  geom_line(stat = "summary", fun = mean, aes(group = as.factor(group))) +
  stat_summary(fun.data = mean_se, geom = "errorbar", width = 0.01) +
  xlab("NRP Time (hours)") +
  ylab("Lactate (mmol/l)") +
  geom_label(data = df_labels, aes(y = mean_lactate, label = n), vjust = 1, hjust = 0, label.size = 0, fill = NA) +
  theme_bw() +
  ggtitle("Panel A") +
  theme(plot.title = element_text(hjust = 0.5)) +
  theme(text = element_text(family = "Helvetica", size = 20)) +
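  scale_color_manual(
    name = "Outcome",
    # breaks must match the values of `group` in the example data;
    # with the original 0/1 outcome column, use c("0", "1") instead.
    # Which group is "Negative" or "Positive" is assumed here.
    breaks = c("A", "B"),
    labels = c("Negative", "Positive"),
    values = c("#E12000", "#002F80")
  )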
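
To check what df_labels should contain before plotting, the same aggregation can be sketched in base R. This is only an illustration: it rebuilds the sample data inline so that it runs on its own, and the intermediate names `means` and `counts` are not part of the code above.

```r
# Sample data from the question, rebuilt inline so the snippet is self-contained
df <- data.frame(
  id = rep(1:4, each = 3),
  group = rep(c("A", "B", "A", "B"), each = 3),
  time = rep(1:3, times = 4),
  lactate = c(1.2, 1.1, 1.3, 0.8, 0.7, 0.9, 0.7, 0.9, 1.3, 0.5, 0.6, 0.7)
)

# Mean lactate and number of observations per time point and group
means <- aggregate(lactate ~ time + group, data = df, FUN = mean)
counts <- aggregate(lactate ~ time + group, data = df, FUN = length)
names(means)[names(means) == "lactate"] <- "mean_lactate"
names(counts)[names(counts) == "lactate"] <- "n"

# One row per time/group combination: label position (mean) and label text (n)
df_labels <- merge(means, counts, by = c("time", "group"))
df_labels
```

With the sample data, every time/group combination has two observations, so each label reads 2.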

DATA

df <- structure(list(id = c(
  1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L,
  4L, 4L
), group = c(
  "A", "A", "A", "B", "B", "B", "A", "A", "A",
  "B", "B", "B"
), time = c(
  1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L,
  1L, 2L, 3L
), lactate = c(
  1.2, 1.1, 1.3, 0.8, 0.7, 0.9, 0.7, 0.9,
  1.3, 0.5, 0.6, 0.7
)), class = "data.frame", row.names = c(
  NA,
  -12L
))
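
If you would rather keep everything inside ggplot, the counts can also be computed by stat_summary itself. The following is a minimal sketch of that alternative, not the approach used above: `n_label` is a hypothetical helper name, the nudge of 0.08 is an arbitrary choice, and the data is the same sample rebuilt inline.

```r
library(ggplot2)

# Sample data from the question, rebuilt inline so the snippet is self-contained
df <- data.frame(
  id = rep(1:4, each = 3),
  group = rep(c("A", "B", "A", "B"), each = 3),
  time = rep(1:3, times = 4),
  lactate = c(1.2, 1.1, 1.3, 0.8, 0.7, 0.9, 0.7, 0.9, 1.3, 0.5, 0.6, 0.7)
)

# Hypothetical helper: for the y values at one time point in one group,
# return the label position (their mean) and the label text (their count)
n_label <- function(y) {
  data.frame(y = mean(y), label = length(y))
}

p <- ggplot(df, aes(x = time, y = lactate, color = group)) +
  stat_summary(fun = mean, geom = "point", size = 4) +
  stat_summary(fun = mean, geom = "line") +
  # Same grouping as the points, so each mean gets its own count
  stat_summary(
    fun.data = n_label, geom = "text",
    position = position_nudge(x = 0.08), show.legend = FALSE
  )

p
```

This keeps the plot in a single pipeline, but the aggregated data frame is usually easier to inspect and to reuse for other labels.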
stefan