ggplot: color points by density as they approach a specific value?

Question

I have a dataset containing 1,000 values for a model. The values all fall within the same range (y = 40-70), so the points overlap heavily. I'd like to use color to show the density of the points as they converge on a single value (y = 56.72), which I have marked with a horizontal dashed line on the plot below. How can I color the points to show this?

ggplot(data, aes(x = model, y = value)) +
  geom_point(size = 1) +
  geom_hline(yintercept = 56.72,
             linetype = "dashed",
             color = "black")

hi, I have no access to R right now ... but as a start you could do something like count the values on certain points and then use the count as fill or something like that :) — sambold, Jul 03 '20 at 20:05
See also [my answer here](https://stackoverflow.com/a/58523956/1870254) on how to easily color points by an estimated density. — jan-glx, Sep 25 '20 at 13:36

score 2 · Accepted Answer · edited Sep 25 '20 at 13:33

I think you should opt for a histogram or density plot:

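library(ggplot2)

# Simulated stand-in for the question's data
n <- 500
data <- data.frame(model = rep("model", n), value = rnorm(n, 56.72, 10))

# Histogram with a density curve scaled to counts (matches binwidth = 1)
ggplot(data, aes(x = value, y = after_stat(count))) +
  geom_histogram(binwidth = 1) +
  geom_density(linewidth = 1) +  # `linewidth` needs ggplot2 >= 3.4.0; older versions use `size`
  geom_vline(xintercept = 56.72, linetype = "dashed", color = "black") +
  theme_bw()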

Here is your plot with the same data:

ggplot(data, aes(x = model, y = value))+ 
  geom_point(size = 1) + 
  geom_hline(yintercept = 56.72, linetype = "dashed", color = "black")

If your model is iterative and does converge to the value, I suggest plotting the values as a function of the iteration to show the convergence. Another option, which keeps the plot similar to yours, is to dodge the position of the points:

ggplot(data, aes(x = model, y = value))+ 
  geom_point(position = position_dodge2(width = 0.2),
             shape = 1,
             size = 2,
             stroke = 1,
             alpha = 0.5) + 
  geom_hline(yintercept = 56.72, linetype = "dashed", color = "black")
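
For the first suggestion, plotting against the iteration, a minimal sketch might look like the following. It assumes a hypothetical `iteration` column, which the question's data may not have, and the simulated values (with noise that shrinks over time) are there only to mimic a converging model.

```r
library(ggplot2)

set.seed(1)
n <- 500
target <- 56.72

# Hypothetical data: `iteration` is an assumed column, and the noise shrinks
# with the iteration number only to imitate convergence toward `target`.
conv <- data.frame(iteration = seq_len(n))
conv$value <- target + rnorm(n, mean = 0, sd = 10 / sqrt(conv$iteration))

# Value against iteration, with the target marked by the dashed line
p_conv <- ggplot(conv, aes(x = iteration, y = value)) +
  geom_point(size = 1, alpha = 0.5) +
  geom_hline(yintercept = target, linetype = "dashed", color = "black") +
  theme_bw()
p_conv
```

With real model output, `conv` would be replaced by your own data frame and `iteration` by whatever column records the step.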

Here is a color density plot as you asked:

library(dplyr)
library(ggplot2)

data %>%
  # Assign each value to a 1-unit-wide bin between 10 and 120
  mutate(bin = cut(value, breaks = 10:120)) %>%
  dplyr::group_by(bin) %>%
  # "density" here is simply the number of points sharing the bin
  mutate(density = dplyr::n()) %>%
  ggplot(aes(x = model, y = value, color = density)) +
  geom_point(size = 1) +
  geom_hline(yintercept = 56.72, linetype = "dashed", color = "black") +
  scale_colour_viridis_c(option = "A")
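
As an alternative to binned counts, the color could come from a smooth kernel density estimate. This is only a sketch under assumptions: it swaps in base R's `density()` and `approx()` in place of the binning above, uses the default bandwidth, and reuses simulated data of the same shape as earlier in this answer.

```r
library(ggplot2)

set.seed(1)  # illustrative data only, same shape as the example above
n <- 500
data <- data.frame(model = rep("model", n), value = rnorm(n, 56.72, 10))

# Kernel density estimate of `value`, evaluated on a regular grid
kde <- density(data$value)

# Interpolate the gridded estimate back onto each observed value
data$density <- approx(kde$x, kde$y, xout = data$value)$y

# Same plot as before, colored by the estimated density
p <- ggplot(data, aes(x = model, y = value, color = density)) +
  geom_point(size = 1) +
  geom_hline(yintercept = 56.72, linetype = "dashed", color = "black") +
  scale_colour_viridis_c(option = "A")
p
```

Unlike the binned version, the result does not depend on where the bin edges fall, although it does depend on the bandwidth that `density()` picks.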

Thanks for all of the options @denis! The color density throws an error: Error: `n()` must only be used inside dplyr verbs. Is there a work around this? — mef022, Jul 03 '20 at 23:45
yes, use `dplyr::n()`. It is because you have `plyr` loaded too. — denis, Jul 04 '20 at 07:57

score 0 · Answer 2 · answered Jul 03 '20 at 21:06

I would suggest using the alpha parameter of geom_point with a value close to 0, so that regions where many points overlap appear darker:

ggplot(data, aes(x=model, y=value)) + 
  geom_point(size=1, alpha = .1) + 
  geom_hline(yintercept=56.72, linetype="dashed", color = "black")

eastclintw00d

2,250
1
9
18

ggplot: color points by density as they approach a specific value?

2 Answers2