0

I have eg data and syntax for a scatter (jitter) plot below

eg_data <- data.frame(
period = c(sample( c("1 + 2"), 1000, replace = TRUE)),
max_sales = c(sample( c(1,2,3,4,5,6,7,8,9,10), 1000, replace = TRUE, prob = 
c(.20, .10, .15, .20, .15, .10, .05, .02, .02, .01))) )

jitter <-  (
(ggplot(data = eg_data, aes(x=period, y=max_sales)) +
geom_jitter(stat = "identity", width = .15, color = "blue", alpha = .4)) +
scale_y_continuous(breaks= seq(0,12, by=1)) +
stat_summary(fun.y = "quantile", fun.args = list(probs = c(0.25)), geom = "hline", aes(yintercept = ..y..), colour = "red", size = 1) +
stat_summary(fun.y = "mean", geom = "hline", aes(yintercept = ..y..), colour = "gold", size = 1) +
stat_summary(fun.y = "quantile", fun.args = list(probs = c(0.50)), geom = "hline", aes(yintercept = ..y..), colour = "blue", size = 1) +
stat_summary(fun.y = "quantile", fun.args = list(probs = c(0.75)), geom = "hline", aes(yintercept = ..y..), colour = "black", size = 1) +
stat_summary(fun.y = "quantile", fun.args = list(probs = c(0.90)), geom = "hline", aes(yintercept = ..y..), colour = "green", size = 1) +
ggtitle("Max Sales x Period 1 and 2") + xlab("Period") + ylab("Sales") +
theme(plot.title = element_text(color = "black", size = 14, face = "bold", hjust = 0.5),
      axis.title.x = element_text(color = "black", size = 12, face = "bold"), 
      axis.title.y = element_text(color = "black", size = 12, face = "bold")) +
labs(fill = "Period") )
jitter

I cannot find documentation on how to define a legend for the horiztonal quantile / mean lines I have in this graph.

How to add legend to ggplot manually? - R

I came across this SO question / answer but I wasn't able to implement it, when I include color inside the aes setting, it doesn't work.

EDIT - a member suggested I add color to the aes specification...here is the same graph with color and size included.

jitter2 <-  (
(ggplot(data = eg_data, aes(x=period, y=max_sales)) +
geom_jitter(stat = "identity", width = .15, color = "blue", alpha = .4)) +
scale_y_continuous(breaks= seq(0,12, by=1)) +
stat_summary(fun.y = "quantile", fun.args = list(probs = c(0.25)), geom = "hline", aes(yintercept = ..y.., colour = "red"), size = 1) +
stat_summary(fun.y = "mean", geom = "hline", aes(yintercept = ..y.., colour = "gold"), size = 1) +
stat_summary(fun.y = "quantile", fun.args = list(probs = c(0.50)), geom = "hline", aes(yintercept = ..y.., colour = "blue"), size = 1) +
stat_summary(fun.y = "quantile", fun.args = list(probs = c(0.75)), geom = "hline", aes(yintercept = ..y.., colour = "black"), size = 1) +
stat_summary(fun.y = "quantile", fun.args = list(probs = c(0.90)), geom = "hline", aes(yintercept = ..y.., colour = "green"), size = 1) +
ggtitle("Max Sales x Period 1 and 2") + xlab("Period") + ylab("Sales") +
theme(plot.title = element_text(color = "black", size = 14, face = "bold", hjust = 0.5),
      axis.title.x = element_text(color = "black", size = 12, face = "bold"), 
      axis.title.y = element_text(color = "black", size = 12, face = "bold")) +
labs(fill = "Period") )
jitter2

So...any help is appreciated. Thank you!

Adam_S
  • 687
  • 2
  • 12
  • 24
  • 3
    Why don't you use boxplot? It shows same quantile information and is understandable for everyone. – pogibas Dec 17 '18 at 15:58
  • "I came across this SO question / answer" -- Looks like you forgot the link – camille Dec 17 '18 at 16:06
  • In order to get a legend, you'd have to have something assigned to an aesthetic, such as color – camille Dec 17 '18 at 16:07
  • 1
    @PoGibas - a boxplot is absolutely not understandable by everyone. In ten years of analytical work, my experience is the general public doesn't get them, at all. But they do get lines, which is why I asked for help. – Adam_S Dec 17 '18 at 16:38
  • @camile, I edited the question and posted the link, sorry. I also edited the question, adding a second jitter with color and size included within the aes parameter, to show why that doesn't work for me. – Adam_S Dec 17 '18 at 16:38

1 Answers1

0

I found the answer to my own question.

jitter <-  (
(ggplot(data = eg_data, aes(x=period, y=max_sales)) +
geom_jitter(stat = "identity", width = .15, color = "blue", alpha = .4)) +
scale_y_continuous(breaks= seq(0,12, by=1)) +
stat_summary(fun.y = "quantile", fun.args = list(probs = c(0.25)), geom = "hline", aes(yintercept = ..y.., colour = "25%"), size = 1) +
stat_summary(fun.y = "quantile", fun.args = list(probs = c(0.50)), geom = "hline", aes(yintercept = ..y.., colour = "50%"), size = 1) +
stat_summary(fun.y = "quantile", fun.args = list(probs = c(0.75)), geom = "hline", aes(yintercept = ..y.., colour = "75%"), size = 1) +
stat_summary(fun.y = "quantile", fun.args = list(probs = c(0.90)), geom = "hline", aes(yintercept = ..y.., colour = "90%"), size = 1) +
stat_summary(fun.y = "mean", geom = "hline", aes(yintercept = ..y.., colour = "mean"), size = 1.5) +
ggtitle("Max Sales x Period 1 and 2") + xlab("Period") + ylab("Sales") +
theme(plot.title = element_text(color = "black", size = 14, face = "bold", hjust = 0.5),
      axis.title.x = element_text(color = "black", size = 12, face = "bold"), 
      axis.title.y = element_text(color = "black", size = 12, face = "bold")) +
scale_colour_manual(values = c("red", "blue", "gold", "green", "black"), name = "Percentiles"))
jitter

Also, quickly, the idea of "just use (something), everyone gets it" as a suggestion is not helpful, and assumes way too much about the final intended audience. First time I've ever posted a question and had that as a reply. I asked a specific question for a specific reason.

Adam_S
  • 687
  • 2
  • 12
  • 24