
I am generating ECDF plots using the stat_ecdf function from ggplot2. I would like to colour the individual points of a single ECDF according to a categorical variable.

I'll use the iris dataset as a toy example. When I use aes(col = Species) in stat_ecdf, the plot is split into three separate ECDFs, one per species, rather than one ECDF whose points are coloured by species. For example:

library(ggplot2)

ggplot(iris) +
  stat_ecdf(aes(x = Sepal.Length,
                col = Species),
            geom = "point")

[Image: three ecdfs]

I managed to create my desired output by combining the base R ecdf function with geom_point, like so:

ggplot(data = data.frame(
  value = iris$Sepal.Length,
  ecdf = ecdf(iris$Sepal.Length)(iris$Sepal.Length),
  spec = iris$Species
)) +
  geom_point(aes(x = value, y = ecdf, col = spec))

[Image: one ecdf]
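A variation on the same workaround, given as a minimal sketch only: it still precomputes the ECDF outside ggplot2, but keeps stat_ecdf for an uncoloured step line, so that stat_ecdf sees a single group, and maps colour only in the point layer. The names `iris_ecdf` and the `ecdf` column are illustrative assumptions, not anything provided by ggplot2.

```r
library(ggplot2)

# Precompute the ECDF over the whole sample, ignoring Species.
# `iris_ecdf` and its `ecdf` column are illustrative names.
iris_ecdf <- transform(iris, ecdf = ecdf(Sepal.Length)(Sepal.Length))

p <- ggplot(iris_ecdf, aes(x = Sepal.Length)) +
  # No colour mapping in this layer, so stat_ecdf computes one curve.
  stat_ecdf(geom = "step", colour = "grey60") +
  # Colour is mapped only here, on the precomputed ECDF values.
  geom_point(aes(y = ecdf, colour = Species))

p
```

This does not make stat_ecdf itself colour the points; it only layers the precomputed, coloured points over the single curve that stat_ecdf draws.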

My question is: is it possible to produce the second graph using the stat_ecdf function?

Michael Bird
  • This goes against ggplot2's theoretical background. I don't think you can do better than `ggplot(iris) + geom_point(aes(x = Sepal.Length, y = ecdf(Sepal.Length)(Sepal.Length), color = Species))`, which is basically what you have. – Roland Aug 24 '17 at 08:06
  • Hi @Roland, thanks for your comment, do you have a link to any more information about why what I want to do goes against ggplot2's theoretical background? – Michael Bird Aug 24 '17 at 09:06
  • Read Hadley's book if you want to learn more about The Grammar of Graphics. – Roland Aug 24 '17 at 09:08
  • thanks, I'll have a look. – Michael Bird Aug 24 '17 at 09:13
  • While it's not entirely a duplicate, this one also helps to address something similar: https://stackoverflow.com/questions/18379933/plotting-cumulative-counts-in-ggplot2 – chrimaho Aug 23 '20 at 22:23
