
I need to plot more than 10,000 points on a single scatter plot, and many of them overlap. As a result, the PDF file is very large, which I would like to avoid.

Since I am using ggplot2 to produce all the plots, I wonder whether there is a function that avoids drawing the many overlapping points individually, say, by using hues?

I know that stat_hexbin may work in such cases, but I am looking for an approach similar to the smoothScatter function in base R graphics. A density shown as a color gradient would be great. Thanks!

alittleboy
  • stat_hexbin is designed for exactly such cases. You can customize the number of bins and the gradient colors. Another alternative is to save in a raster format. – Señor O Apr 22 '14 at 17:24
  • You can also use jitter and an alpha scale for coloring the points. Saving as PNG will result in a smaller file. This is related to a question asked earlier; let me find it, since there are lots of useful tips and tricks in the responses. – infominer Apr 22 '14 at 17:28
  • Have a look at the comments on this question: http://stackoverflow.com/questions/22671740/how-to-save-a-pdf-in-r-with-a-lot-of-points#comment34540907_22671740 – infominer Apr 22 '14 at 17:33
  • Try `smoothScatter` also (although it's not a ggplot2 function). – Ari B. Friedman Apr 22 '14 at 17:37
  • [`stat_density2d(...)`](http://docs.ggplot2.org/0.9.3.1/stat_density2d.html) may be what you're looking for. – jlhoward Apr 22 '14 at 19:26
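
For what it's worth, the `stat_density2d` suggestion above could look roughly like the sketch below, which draws a 2D kernel density as a raster with a color gradient, in the spirit of smoothScatter. This is only an illustration under assumptions: the data are simulated stand-ins for the real points, the names `df`, `p`, `n_grid`, and `out_file` are made up for the example, and `after_stat()` assumes ggplot2 3.3.0 or later (where the stat is spelled `stat_density_2d`).

```r
library(ggplot2)

# Simulated data standing in for the real points (an assumption).
set.seed(42)
n_points <- 20000
df <- data.frame(x = rnorm(n_points))
df$y <- 0.5 * df$x + rnorm(n_points)

# Resolution of the density grid along each axis (hypothetical choice).
n_grid <- 100

# Draw a 2D kernel density estimate as a raster. The plot then contains
# n_grid x n_grid cells instead of one glyph per observation.
p <- ggplot(df, aes(x, y)) +
  stat_density_2d(
    aes(fill = after_stat(density)),
    geom = "raster",
    contour = FALSE,
    n = n_grid
  ) +
  scale_fill_gradient(low = "white", high = "darkblue") +
  theme_minimal()

# The PDF stores the density grid, so its size should no longer grow
# with the number of points.
out_file <- file.path(tempdir(), "density_scatter.pdf")
ggsave(out_file, p, width = 6, height = 5)
```

Lowering `n_grid` gives a coarser but smaller image, and the `low`/`high` colors of `scale_fill_gradient` can be changed to taste. If individual outliers matter, a light `geom_point(alpha = ...)` layer on top is an option, at the cost of putting the points back into the file.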
