
I am very puzzled. In ggplot2, many people use geom_jitter to add points to boxplots, for instance. As far as I know, it is supposed to leave values on the Y-axis unchanged and jitter them only along the X-axis.

Using it today on two groups, 3 points per group, all with the same value, I see that it also jitters values along the Y-axis.

library(ggplot2)

condition = c(rep("A", 3), rep("B", 3))
fraction = c(rep(100, 3), rep(100, 3))
df = data.frame(condition, fraction)


ggplot(df, aes(condition, fraction))+
  geom_jitter(width = 0.2)+
  labs(title = "",
       x = "", y = "fraction")+
  ylim(95,105)+
  theme_classic()

Graph is below (sorry too new to post an image apparently, so that's a link):

graph resulting from the code

Anyone?

2 Answers


Welcome to SO.

geom_jitter() jitters both horizontally and vertically by default: when width and height are omitted, each defaults to 40% of the resolution of the data. To avoid vertical jitter, set height = 0.

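# seed is an argument of position_jitter(), not of geom_jitter() itself
geom_jitter(position = position_jitter(height = 0, seed = 123))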
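As a quick check, here is a minimal sketch that applies this to the data from the question and inspects what actually gets drawn. It assumes the seed is passed through position_jitter() and uses layer_data() to pull the plotted coordinates; the names p and drawn are just illustrative.

```r
library(ggplot2)

# Same data as in the question: two groups, three identical values each
df <- data.frame(
  condition = rep(c("A", "B"), each = 3),
  fraction  = rep(100, 6)
)

# Horizontal jitter only: height = 0 leaves the y-values untouched.
# The seed (an arbitrary value here) makes the jitter reproducible.
p <- ggplot(df, aes(condition, fraction)) +
  geom_jitter(position = position_jitter(width = 0.2, height = 0, seed = 123)) +
  ylim(95, 105) +
  theme_classic()

# layer_data() returns the values actually drawn by the first layer
drawn <- layer_data(p, 1)
all(drawn$y == 100)
```

If the last line evaluates to TRUE, the points were only moved sideways.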


Martin C. Arnold
  • Thanks Martin! I was not aware. Pretty dangerous and confusing to not have height = 0 by default, at least for me. Sorry I cannot +1, too new... – PM_from_Mars Sep 12 '22 at 13:43
  • You're welcome. It certainly depends on the visualisation type. For a 2D scatterplot it's actually a plausible choice if we are interested in reducing [overplotting](https://www.displayr.com/what-is-overplotting/). – Martin C. Arnold Sep 12 '22 at 13:47
  • And a reasonable explanation! Thank you very much! – PM_from_Mars Sep 12 '22 at 13:55

According to the documentation of geom_jitter, it can jitter in both dimensions. Use the width and height arguments to control how much jitter is applied in each direction. Check out the example below:

ggplot() +
    geom_point(data = mpg[1, ], aes(cty, hwy), color = "red") +
    geom_jitter(data = mpg[1, ], aes(cty, hwy), width = 0.5, height = 0.5, color = "blue") + 
    geom_jitter(data = mpg[1, ], aes(cty, hwy), width = 0.5, height = 0, color = "green")

The green dot always has the same y-value as the red dot (the original) because height is set to 0.