I've just started working with R and am trying to find out how to add mean and median labels to a box plot using ggplot.
I have a dataset with the columns Unit, Quarter, and Days (number of days), plus an extra column Z:
dset <- read.table(text='Unit Quarter Days Z
HH 1Q 25 Y
PA 1Q 28 N
PA 1Q 10 Y
HH 1Q 53 Y
HH 1Q 12 Y
HH 1Q 20 Y
HH 1Q 43 N
PA 1Q 11 Y
PA 1Q 66 Y
PA 1Q 54 Y
PA 2Q 19 N
PA 2Q 46 Y
PA 2Q 37 Y
HH 2Q 22 Y
HH 2Q 67 Y
PA 2Q 45 Y
HH 2Q 48 Y
HH 2Q 15 N
PA 3Q 12 Y
PA 3Q 53 Y
HH 3Q 58 Y
HH 3Q 41 N
HH 3Q 18 Y
PA 3Q 26 Y
PA 3Q 12 Y
HH 3Q 63 Y
', header=TRUE)
I need to show data by Unit and Quarter and create a boxplot displaying mean and median values.
My code for a boxplot:
library(ggplot2)
ggplot(data = dset, aes(x = Quarter, y = Days, fill = Quarter)) +
geom_boxplot(outlier.shape = NA) +
facet_grid(. ~ Unit) + # adding another dimension
coord_cartesian(ylim = c(10, 60)) + #sets the y-axis limits
stat_summary(fun.y=mean, geom="point", shape=20, size=3, color="red", fill="red") + #adds average dot
geom_text(data = means, aes(label = round(Days, 1), y = Days + 1), size = 3) + #adds average labels
geom_text(data = medians, aes(label = round(Days, 1), y = Days - 0.5), size = 3) + #adds median labels
xlab(" ") +
ylab("Days") +
ggtitle("Days") +
theme(legend.position = 'none')
I can use the geom_text function to add mean and median labels, but only for one dimension ("Quarter"), and it requires calculating the means and medians beforehand:
means <- aggregate(Days ~ Quarter, dset, mean)
medians <- aggregate(Days ~ Quarter, dset, median)
This works well, and I also managed to calculate the mean and median values by both "Unit" and "Quarter":
means <- aggregate(dset[, 'Days'], list('Unit' = dset$Unit, 'Quarter' = dset$Quarter), mean)
medians <- aggregate(dset[, 'Days'], list('Unit' = dset$Unit, 'Quarter' = dset$Quarter), median)
but I do not know how to pass those variables to the geom_text function to display labels for the mean and median. Maybe I should calculate the mean and median in a different way, or maybe there are other ways to add those labels.
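One possible direction, given as a minimal sketch rather than a definitive answer: when aggregate is called with a plain vector and a by-list, as in the two lines above, the value column in the result is named x rather than Days, so aes(label = round(Days, 1)) has nothing to refer to. The formula interface keeps the names Unit, Quarter and Days, and because the summary tables then carry a Unit column, facet_grid can place each label in the matching panel. The data frame below is the same data as in the question, rebuilt compactly so the sketch runs on its own; the offsets + 1 and - 0.5 are copied from the original plot code.

```r
library(ggplot2)

# Same values as the question's table, rebuilt compactly (column Z is not needed here)
dset <- data.frame(
  Unit = c("HH", "PA", "PA", "HH", "HH", "HH", "HH", "PA", "PA", "PA",
           "PA", "PA", "PA", "HH", "HH", "PA", "HH", "HH",
           "PA", "PA", "HH", "HH", "HH", "PA", "PA", "HH"),
  Quarter = rep(c("1Q", "2Q", "3Q"), times = c(10, 8, 8)),
  Days = c(25, 28, 10, 53, 12, 20, 43, 11, 66, 54,
           19, 46, 37, 22, 67, 45, 48, 15,
           12, 53, 58, 41, 18, 26, 12, 63)
)

# The formula interface keeps the column names Unit, Quarter and Days
means   <- aggregate(Days ~ Unit + Quarter, dset, mean)
medians <- aggregate(Days ~ Unit + Quarter, dset, median)

p <- ggplot(dset, aes(x = Quarter, y = Days, fill = Quarter)) +
  geom_boxplot(outlier.shape = NA) +
  facet_grid(. ~ Unit) +
  # Each summary table has a Unit column, so labels land in the right facet
  geom_text(data = means, aes(label = round(Days, 1), y = Days + 1), size = 3) +
  geom_text(data = medians, aes(label = round(Days, 1), y = Days - 0.5), size = 3) +
  theme(legend.position = "none")

print(p)
```

Running names(means) on the result should list Unit, Quarter and Days, which is what the geom_text layers rely on.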
Would be grateful for any suggestions!
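As for other ways to add the labels, one option that may be worth trying is to let stat_summary compute the values per panel and draw them as text, so no summary tables are needed. This is only a sketch under assumptions: it relies on the fun argument and after_stat(), which as far as I know require ggplot2 3.3.0 or later (older versions used fun.y and ..y..), and the vjust offsets are arbitrary choices. The data frame repeats the question's values so the block runs on its own.

```r
library(ggplot2)

# Same values as the question's table, rebuilt compactly (column Z is not needed here)
dset <- data.frame(
  Unit = c("HH", "PA", "PA", "HH", "HH", "HH", "HH", "PA", "PA", "PA",
           "PA", "PA", "PA", "HH", "HH", "PA", "HH", "HH",
           "PA", "PA", "HH", "HH", "HH", "PA", "PA", "HH"),
  Quarter = rep(c("1Q", "2Q", "3Q"), times = c(10, 8, 8)),
  Days = c(25, 28, 10, 53, 12, 20, 43, 11, 66, 54,
           19, 46, 37, 22, 67, 45, 48, 15,
           12, 53, 58, 41, 18, 26, 12, 63)
)

p <- ggplot(dset, aes(x = Quarter, y = Days, fill = Quarter)) +
  geom_boxplot(outlier.shape = NA) +
  facet_grid(. ~ Unit) +
  # Red dot at the mean of each box
  stat_summary(fun = mean, geom = "point", shape = 20, size = 3, color = "red") +
  # Mean label, nudged above the computed value (vjust is an arbitrary choice)
  stat_summary(fun = mean, geom = "text", size = 3, vjust = -1,
               aes(label = round(after_stat(y), 1))) +
  # Median label, nudged below the computed value
  stat_summary(fun = median, geom = "text", size = 3, vjust = 1.5,
               aes(label = round(after_stat(y), 1))) +
  theme(legend.position = "none")

print(p)
```

Because the summaries are computed inside each facet, this version should follow any change to the faceting or grouping without recalculating anything by hand.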