Is there a way to color the jitter points on a boxplot based on a numeric value, so for example:

ggplot(data, aes(y = y_var, x = x_var)) +
  geom_jitter(size = 2, aes(color = ifelse(y_var < 5, "red", "black")))

I've added this reproducible example that doesn't quite work (the colors on the plot don't correspond to the jitter call):

library(ggplot2)

a <- rnorm(100, mean = 5, sd = 1)
b <- as.factor(sample(0:1, 100, replace = TRUE))
# Build the data frame directly; cbind() would coerce the factor to its integer codes
test_data <- data.frame(a, b)

ggplot(test_data, aes(y = a, x = b)) +
  geom_boxplot() +
  geom_jitter(aes(color = ifelse(a < 5, "red", "black")))

tjebo
mrpargeter
  • Thanks for the reply. Not quite, see my reproducible example above. – mrpargeter Jul 03 '18 at 16:48
  • You can use `scale_color_identity` to set the colors as in [this answer](https://stackoverflow.com/a/15804570/2461552). Or move `color` outside `aes()`: `color=ifelse(test_data$a<5,"red","black")` if things are simple. – aosmith Jul 03 '18 at 17:00
  • Ah yes, I'll follow the scale_color_identity suggestion, thanks! – mrpargeter Jul 03 '18 at 17:01

1 Answer

Putting color names inside aes() as you did doesn't tell the color scale which colors to use; it just splits the values into categories. The strings "red" and "black" are treated as arbitrary category labels, so ggplot assigns them colors from its default palette. If you want to assign colors inside aes(), give the color names or hex codes you want, then add scale_color_identity() so that "red" is read literally as "draw this in red," and so on.

library(tidyverse)

ggplot(test_data, aes(y = a, x = b)) +
  geom_boxplot() +
  geom_jitter(aes(color = ifelse(a < 5, "red", "black"))) +
  scale_color_identity()
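
To see the difference concretely, one option is to inspect the colors ggplot2 actually computes for the jitter layer. The sketch below is not part of the original reprex: it rebuilds test_data with an arbitrary seed and uses layer_data() to compare the plot with and without the identity scale.

```r
library(ggplot2)

set.seed(42)  # arbitrary seed, only so this sketch is reproducible
test_data <- data.frame(
  a = rnorm(100, mean = 5, sd = 1),
  b = factor(sample(0:1, 100, replace = TRUE))
)

base_plot <- ggplot(test_data, aes(y = a, x = b)) +
  geom_boxplot() +
  geom_jitter(aes(color = ifelse(a < 5, "red", "black")))

# Layer 2 is geom_jitter. With the default discrete scale, "red" and "black"
# are only category labels, so the computed colours come from the default palette.
default_colours <- unique(layer_data(base_plot, 2)$colour)

# With scale_color_identity(), the strings are passed through as literal colours.
identity_colours <- unique(
  layer_data(base_plot + scale_color_identity(), 2)$colour
)

print(default_colours)
print(identity_colours)
```

The first vector should hold palette hex codes rather than the names you typed, which is why the plotted colors don't match the ifelse() call.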

Better yet (and more scalable and maintainable) is a separation of concerns: let geoms create the geometries and map variables onto scales, and let scales control how those mappings look. You can map color to a < 5 (a computed expression rather than a column in your data frame), which evaluates to TRUE or FALSE for each row. Then use a color scale, such as scale_color_manual, to assign a color to each of those two values.

ggplot(test_data, aes(y = a, x = b)) +
  geom_boxplot() +
  geom_jitter(aes(color = a < 5)) +
  scale_color_manual(values = c("TRUE" = "red", "FALSE" = "black"))

Created on 2018-07-03 by the reprex package (v0.2.0).
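
If the threshold is used in more than one place, a variation on the same idea is to store the category as a real column and name the colors by its values. This is a minimal sketch under assumptions: the column name range and its labels are made up for illustration, and the data is regenerated with an arbitrary seed.

```r
library(ggplot2)

set.seed(1)  # arbitrary seed, only so this sketch is reproducible
test_data <- data.frame(
  a = rnorm(100, mean = 5, sd = 1),
  b = factor(sample(0:1, 100, replace = TRUE))
)

# Hypothetical helper column: a readable label for each side of the threshold
test_data$range <- ifelse(test_data$a < 5, "below 5", "5 or above")

p <- ggplot(test_data, aes(y = a, x = b)) +
  geom_boxplot() +
  geom_jitter(aes(color = range)) +
  # The scale owns the colors; the names must match the values in `range`
  scale_color_manual(values = c("below 5" = "red", "5 or above" = "black"))

p
```

The legend then shows the readable labels instead of TRUE and FALSE, and changing a color means editing only the scale.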

camille