
I have a dataset that looks like this. It's saved as a file called "prescriptivism_scores.csv":

Usage_Guide Usage_Problem Prescriptivism_Index
Book 1 who/whom 2
Book 2 who/whom 2
Book 3 who/whom 2.5
Book 4 who/whom 4
Book 5 who/whom 2
Book 6 who/whom 1.5
Book 7 who/whom 3
Book 8 who/whom 2
Book 9 who/whom 4
Book 10 who/whom 4
Book 11 who/whom 2

I used this code

library(ggplot2)

df <- read.csv(file = 'prescriptivism_scores.csv')
ggplot(df, aes(y = Prescriptivism_Index, x = Usage_Problem)) +
  geom_boxplot(color = "#838383") +
  
  geom_dotplot(binaxis = "y", stackdir = "center", dotsize = 0.5, color = "#00853E", fill = "#C4D600", stackratio = 1.5) +
 
  scale_x_discrete(name = "Usage Problem", 
                   breaks = c("different_to_than_from", "I_for_me", "lay_lie", "less_fewer", "none", "singular_they", "split_infinitive", "who_whom"),
                   labels = c("DIFF TO/THAN/FROM", "I FOR ME", "LAY/LIE", "LESS/FEWER", "NONE", "SG THEY", "SPLIT INF", "WHO/WHOM")) +
  
  ylab("Prescriptivism Index") +
  
  stat_summary(fun = mean, geom="point", shape=3, size=3, color="#EF4B81") +
 
  theme(panel.background = element_blank()) +
  
  geom_hline(yintercept = 1:4, linetype = "dashed", linewidth = 0.25, color = "darkgray")

to create this box plot

Box plot with no overlapping data points

I'm happy with everything about this box plot except for one thing: I want each dot in the plot to be a different shape to represent the different books in the "Usage_Guide" column of my data. I want to do this so I know which data points correspond to which books.

I've tried adding "shape = Usage_Guide" to the aes() function.

ggplot(df, aes(y = Prescriptivism_Index, x = Usage_Problem, shape = Usage_Guide)) +

But when I do, the dots don't actually change shape, and the dotplot changes so that the dots are overlapping. It also adds dashes to the plot:

Box plot with no box, overlapping data points, and dashes added

If I try to change the color of the dots instead of the shape, I get closer to my end goal, but strange things also happen.

For example, adding a color call to the aes() function and changing the fill to white in the geom_dotplot() function, as shown in the code below, changes the colors of the dots and maintains the box plots, but it causes the data points to overlap.

ggplot(df, aes(y = Prescriptivism_Index, x = Usage_Problem, color = Usage_Guide)) +
  geom_boxplot(color = "#838383") +
  
  geom_dotplot(binaxis = "y", stackdir = "center", dotsize = 0.5, fill = "#FFFFFF", stackratio = 1.5) +

Box plot with colored overlapping data points

But just reversing the fill and outline, so that the fill call is in the aes() function and the color call is in the geom_dotplot() function, breaks something so that the box plots no longer show.

ggplot(df, aes(y = Prescriptivism_Index, x = Usage_Problem, fill = Usage_Guide)) +
  geom_boxplot(color = "#838383") +
  
  geom_dotplot(binaxis = "y", stackdir = "center", dotsize = 0.5, color = "#FFFFFF", stackratio = 1.5) +

Box plot with no box and overlapping colored data points

How can I maintain the look of my original box plot, but with different shapes for each data point?

  • Does this related question help? https://stackoverflow.com/questions/55955847/how-can-i-change-the-shape-in-geom-dotplot-and-see-all-the-dot-is-the-same-frame – Jon Spring Dec 27 '22 at 18:46
  • I see that geom_dotplot does not list `shape` as one of the aesthetics it understands. I suspect there is not a way to independently group by one variable but display dots within a group differently. – Jon Spring Dec 27 '22 at 18:48
  • With `geom_dotplot` this seems not to be possible: – TarJae Dec 27 '22 at 18:48
  • @JonSpring, thanks so much for sharing the link! The beeswarm package seems to be what I'm looking for. I'm still running into an issue, though: when I try to add the shape call to the aes function, it adds the shapes, but it still removes the box plot (it doesn't remove the box plot when I add a color call). Any ideas what I might be doing wrong? – Jordan Smith Dec 27 '22 at 20:45
  • If the shapes relate to the points but not the boxplot, use `aes(shape = ...)` inside the geoms that need it, but leave that out of your global `aes()` call in the `ggplot()` line. – Jon Spring Dec 27 '22 at 21:22
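
A minimal sketch of what the last comment suggests, under some assumptions: the data is typed in inline rather than read from the CSV, and plain `geom_point()` with `position_jitter()` stands in for `geom_dotplot()` (which does not understand `shape`) and for the beeswarm layer mentioned above. The jitter width, the seed, and the particular shape codes are arbitrary illustrative choices, not something from the original plot.

```r
library(ggplot2)

# Inline copy of the question's data so the sketch runs without the CSV file.
books <- paste("Book", 1:11)
df <- data.frame(
  Usage_Guide = factor(books, levels = books),
  Usage_Problem = "who/whom",
  Prescriptivism_Index = c(2, 2, 2.5, 4, 2, 1.5, 3, 2, 4, 4, 2)
)

# The global aes() holds only x and y, so the box plot and the mean marker
# are computed over all books together rather than once per book.
p <- ggplot(df, aes(x = Usage_Problem, y = Prescriptivism_Index)) +
  # Hide the box plot's own outlier markers; every observation is drawn below.
  geom_boxplot(color = "#838383", outlier.shape = NA) +
  # shape is mapped in this layer only. Horizontal jitter (an assumption)
  # separates points that share the same score.
  geom_point(aes(shape = Usage_Guide),
             position = position_jitter(width = 0.15, height = 0, seed = 1),
             size = 2.5, color = "#00853E") +
  # The default shape palette covers only 6 levels, so 11 books need
  # explicit values. Shape 3 is skipped because the mean marker uses it.
  scale_shape_manual(name = "Usage Guide",
                     values = c(0, 1, 2, 4, 5, 6, 7, 8, 9, 10, 11)) +
  stat_summary(fun = mean, geom = "point", shape = 3, size = 3,
               color = "#EF4B81") +
  ylab("Prescriptivism Index") +
  theme(panel.background = element_blank())

print(p)
```

Putting `shape = Usage_Guide` in the global `aes()` instead would split the data into one group per book, so each "box" would be computed from a single observation. That is likely where the dashes in the second plot come from.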
