I am attempting to make boxplots of some complex data. I have sorted the classes by one particular field (not the class field) and would now like to be able to label each box with the value of that sort-by field. I know from the way the data is structured that the value of this sort-by attribute will be the same for every observation within the class, and I would like to essentially annotate the chart with this additional piece of information.

I thought of accomplishing this by adding a point layer to the plot and then labeling those points. Here is an example I mocked up with the mtcars data set for reproducibility. For the sake of this example, pretend that the variable gear is the same for every observation within each distinct value of cyl. The "gear/1000000" part is just there to place the points near the axis.

mtcars %>% group_by(cyl) %>%
  ggplot(aes(x = reorder(cyl, gear), y = mpg)) +
  geom_point(show.legend = FALSE, aes(x = reorder(cyl, gear), y = gear/1000000)) +
  geom_text(aes(label = gear)) +
  geom_boxplot(aes(colour = carb), varwidth = TRUE)

[Image: output of my code]

I feel like this is close, but this code is putting the labels on the boxplots instead of on the points, which is the opposite of what I'm looking for. How can I ask ggplot to label only the points from geom_point()? Or is there an easier way to accomplish my objective?

EDIT: Here is what my plot now looks like, thanks to the answer provided below. [Image: boxplots of IRI distribution for various pavement segments]

Matt H
  • I would identify the outliers in your dataset and base your labeling on whether or not it's an outlier, similar to this post: http://stackoverflow.com/questions/33524669/labeling-outliers-of-boxplots-in-r – Ryan Morton Mar 06 '17 at 20:45

1 Answer

Set a separate x and y aes for geom_text. In your code, geom_text inherits aes(x = reorder(cyl, gear), y = mpg) from the parent ggplot() call, so it draws a label at every (x, y) observation, which puts the labels on the boxplots. Instead, give geom_text its own aes: an x matching the x value of your geom_point layer, and a y offset by a fixed amount from your geom_point y value:

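For example (note: in mtcars there is actually more than one gear value per cyl, so several labels are drawn on top of each other at each position; as you stated, your real data has a single value per class):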

mtcars %>% group_by(cyl) %>%
  ggplot(aes(x = reorder(cyl, gear), y = mpg)) + 
  geom_point(show.legend = FALSE, aes(x = reorder(cyl, gear), y = gear/1000000)) +
  geom_boxplot(aes(colour=carb),varwidth = TRUE) +
  geom_text(aes(label = gear, x = reorder(cyl, gear), y = gear/1000000 - 2))
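
Because the label is meant to be constant within each class, another option is to draw exactly one label per box from a small summary table, which avoids printing the same label once per observation. The following is only a sketch under some assumptions: `labels`, `cyl_levels`, `cars` and `cyl_f` are names made up for this illustration, `median(gear)` stands in for the per-class constant (gear is not actually constant within cyl in mtcars), and the label position y = 0 is an arbitrary choice.

```r
library(ggplot2)

# One row per class: cyl plus a single label value.
# Assumption: median(gear) stands in for the per-class constant, because
# gear is not really constant within cyl in mtcars.
labels <- aggregate(gear ~ cyl, data = mtcars, FUN = median)

# Fix the class order once (sorted by the label value) so that both
# layers share the same discrete x scale.
cyl_levels <- as.character(labels$cyl[order(labels$gear)])
cars <- transform(mtcars, cyl_f = factor(cyl, levels = cyl_levels))
labels$cyl_f <- factor(labels$cyl, levels = cyl_levels)

p <- ggplot(cars, aes(x = cyl_f, y = mpg)) +
  geom_boxplot(varwidth = TRUE) +
  # The text layer uses its own data and aesthetics: one label per class.
  # y = 0 sits inside aes() so the y scale expands to include the labels.
  geom_text(data = labels, aes(x = cyl_f, y = 0, label = gear),
            inherit.aes = FALSE)

print(p)
```

With your real data you could replace the median with any summary that returns the single sort-by value for each class, and move the labels by changing the y value in the text layer.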
Djork