1

I made a scatter plot with ggplot2, like this:

example

However, I want to color the dots by their density. I tried adding an alpha value, but it does not show the density well. How can I color overlapping dots based on their counts?

The data I used contains about 100,000 coordinates (values range from 0 to 1), like this (the first column is x and the second is y):

0.07    0.04
0.02    0.12
0.00    0.03
0.14    0.10

I added an alpha value, and the plot looks like this:

+alpha

The code:

library(ggplot2)
p <- ggplot(file, aes(X1,X2)) + geom_point(size=1,alpha = 0.1)
p + labs(x= " " , y=" ", title=" ") + xlim(0.0,1.0) + ylim(0.0,1.0)
LiSAAA
  • Can you share a reproducible example with a subset of your data or toy data? – Sal Aug 01 '17 at 06:47
  • I have shown some of the data; it has 100k coordinates... – LiSAAA Aug 01 '17 at 07:09
  • I think what @Sal meant was that you should provide a [reproducible R example](https://stackoverflow.com/q/5963269/3250126). Please provide the code that led to the plot shown and share your data (or part of it) with `?dput` – loki Aug 01 '17 at 07:16

3 Answers

3

There is a library that does this well, called ggpointdensity.

It colors each point by the number of neighboring points, which avoids the blocky look of binned plots and requires no separate density calculation.

Example from the README:

library(ggplot2)
library(dplyr)
library(viridis)
library(ggpointdensity)

dat <- bind_rows(
  tibble(x = rnorm(7000, sd = 1),
         y = rnorm(7000, sd = 10),
         group = "foo"),
  tibble(x = rnorm(3000, mean = 1, sd = .5),
         y = rnorm(3000, mean = 7, sd = 5),
         group = "bar"))

ggplot(data = dat, mapping = aes(x = x, y = y)) +
  geom_pointdensity() +
  scale_color_viridis()

slhck
2

To convey density, a dot plot or scatter plot may be suboptimal, because differences in alpha are hard to read.

Have a look at either hexplots (http://ggplot2.tidyverse.org/reference/geom_hex.html) or heatmaps (http://ggplot2.tidyverse.org/reference/geom_bin2d.html) in your case.

As I don't know your data, I will just use ggplot2's diamonds dataset. You can create the aforementioned plots like this (both examples are taken from the documentation):

library(ggplot2)
ggplot(diamonds, aes(carat, price)) +
 geom_hex()

Or like this:

library(ggplot2)
ggplot(diamonds, aes(carat, price)) +
 geom_bin2d(bins = 100)

Addendum

I just noticed that your second question is about the color breaks. To set them, use scale_fill_viridis_c(breaks = c(100, 500, 1500, 2500, 4000)):

ggplot(diamonds, aes(carat, price)) +
  geom_bin2d(bins = 100) + 
  scale_fill_viridis_c(breaks = c(100, 500, 1500, 2500, 4000))

Created on 2020-04-20 by the reprex package (v0.3.0)
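
Regarding the "all the same color" problem raised in the comments: when a few bins hold most of the points, a linear fill scale can make almost every bin look alike. One possible remedy, which is not part of the original answer, is a log-transformed fill scale. The sketch below assumes toy data (`toy`, with columns `X1` and `X2` named after the question's code) and illustrative break values; neither comes from the asker's real data.

```r
library(ggplot2)

# Toy stand-in for the asker's data (an assumption): 100,000 points in [0, 1],
# heavily concentrated near the origin.
set.seed(42)
toy <- data.frame(X1 = rbeta(1e5, 1, 8), X2 = rbeta(1e5, 1, 8))

p_log <- ggplot(toy, aes(X1, X2)) +
  geom_bin2d(bins = 100) +
  # A log10 fill scale spreads the colors over sparse and dense bins alike.
  # The breaks are illustrative values, not taken from the question.
  # Note: ggplot2 >= 3.5.0 prefers the spelling `transform = "log10"`.
  scale_fill_viridis_c(trans = "log10", breaks = c(1, 10, 100, 1000))

p_log
```

The `breaks` argument only controls which counts are labelled in the legend, so it can be combined freely with the transform.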

David
  • I have tried geom_hex before, but it looks bad (all hexagons are the same color)... How can I adjust the range of the count? – LiSAAA Aug 01 '17 at 07:30
  • You could, for example, use `geom_hex(bins = 10)` or some other bin-number. – David Aug 01 '17 at 07:30
  • Take the first image you show above as an example: the count legend on the right reads 5000, 4000, 3000, 2000, 1000. How can I change it to 5000, 4000, 3000, 2000, 1000, 800, 600, 400, 200? – LiSAAA Aug 01 '17 at 07:43
  • Have a look at either `scale_y_continuous` and the breaks argument or `scale_y_log10` depending on the scale you want to apply. – David Aug 01 '17 at 07:46
2

I found some methods:

1) Color scatterplot points by density: this one works well.

2) Josh O'Brien's answer: this is awesome! I would also like to know how to show the relationship between exact density values and colors...

3) Create smoothscatter like plots with ggplot2: these two are also good.
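
For readers who want a concrete starting point, here is a minimal sketch of the general idea behind method 1: estimate a 2D kernel density and color each point by the estimate at its location. It is not copied from any of the linked posts. The helper name `get_density`, the toy data `pts`, and the grid size `n = 100` are assumptions chosen for illustration; `MASS::kde2d` is the only non-ggplot2 function used.

```r
library(ggplot2)

# Hypothetical helper: estimate a 2D kernel density on an n x n grid with
# MASS::kde2d, then look up the grid cell that each point falls into.
get_density <- function(x, y, n = 100) {
  dens <- MASS::kde2d(x, y, n = n)
  ix <- findInterval(x, dens$x)  # grid column index for each x
  iy <- findInterval(y, dens$y)  # grid row index for each y
  dens$z[cbind(ix, iy)]
}

# Toy stand-in for the real data (an assumption): 10,000 points in [0, 1].
set.seed(1)
pts <- data.frame(X1 = rbeta(1e4, 1, 8), X2 = rbeta(1e4, 1, 8))
pts$density <- get_density(pts$X1, pts$X2)

p_kde <- ggplot(pts, aes(X1, X2, color = density)) +
  geom_point(size = 1) +
  scale_color_viridis_c()

p_kde
```

The color legend produced by `scale_color_viridis_c()` is what links colors to the estimated density values, which may partly address the question raised under method 2.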

I am not good at programming, so I can only look for code that others have posted on the Internet :(

LiSAAA