I'd like to ask whether it's possible to label each point plotted by stat_sum with the percentage (i.e. the proportion) of the observations that the point represents. Ideally the label would be in percent format rather than decimal.

Many thanks for your time.

Edit: Minimal reproducible example

library("ggplot2")
library("scales")

ggplot(diamonds, aes(x = cut, y = clarity)) +
  stat_sum(aes(group = 1)) +
  scale_size_continuous(labels=percent)

So my question is: how (if possible) can I label each of those summary points with its 'prop' value as a percentage?

Harry
  • Maybe you can give some (reproducible) example code? – ROLO Nov 16 '12 at 16:35
  • Apologies, I can't share the data I am using, but the plots I am generating are exactly like those in the documentation: http://docs.ggplot2.org/current/stat_sum.html. The second example plot represents what I have. I would like to be able to label each of those summary points with their prop value, albeit ideally in percentage format. I hope that helps to elaborate. – Harry Nov 16 '12 at 17:58
  • @Harry Your question has been unanswered for 16 hours, although it should be pretty easy to resolve, probably because you haven't supplied a minimal reproducible example. May I suggest you create some fake data (a small set) and the bare minimum of supporting code necessary, then add them to your question. Take it from me as another newbie, this improves your chance of getting an answer. Please also see this: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – SlowLearner Nov 17 '12 at 06:24
  • @SlowLearner Thank you for the advice, I've added an example using the diamonds dataset which is built into ggplot2. – Harry Nov 17 '12 at 10:43

1 Answer

There are a few options. I'll assume that the legend is not needed given that the points are labelled with percentage counts.

One option is to add another stat_sum() function that contains a label aesthetic and a "text" geom. For instance:

library("ggplot2")

ggplot(diamonds, aes(x = cut, y = clarity, group = 1)) +
  stat_sum(geom = "point", show.legend = FALSE) +
  stat_sum(aes(label = paste(round(..prop.. * 100, 2), "%", sep = "")), 
              size = 3, hjust = -0.4, geom = "text", show.legend = FALSE)
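
The label can also be built with scales::percent() instead of paste() and round(), which gives the percent format asked for in the question. This is only a sketch under assumptions that are not part of the code above: it needs ggplot2 3.3.0 or later, where after_stat() replaces the ..prop.. notation, and a version of scales whose percent() accepts an accuracy argument. The name p is just an illustrative variable.

```r
library(ggplot2)
library(scales)

# Same two-layer plot as above: points sized by proportion, plus text labels.
# after_stat(prop) refers to the proportion computed by stat_sum;
# with group = 1 it is the share of all observations in each cell.
p <- ggplot(diamonds, aes(x = cut, y = clarity, group = 1)) +
  stat_sum(geom = "point", show.legend = FALSE) +
  stat_sum(
    # accuracy = 0.01 keeps two decimal places, e.g. "1.23%"
    aes(label = after_stat(percent(prop, accuracy = 0.01))),
    geom = "text", size = 3, hjust = -0.4, show.legend = FALSE
  )

p
```

To check the labels without looking at the plot, layer_data(p, 2) returns the computed data for the text layer, including the prop and label columns.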

Or, there may be no need for the points. The labels can do all the work - show location and size:

ggplot(diamonds, aes(x = cut, y = clarity, group = 1)) +
  stat_sum(aes(label = paste(round(..prop.. * 100, 2), "%", sep = "")),
           geom = "text", show.legend = FALSE) +
  scale_size(range = c(2, 8))

Sometimes it is easier to create a summary table outside ggplot:

library(plyr)
# Count rows per cut/clarity cell, then express each count as a percentage of the total
df <- transform(ddply(diamonds, .(cut, clarity), "nrow"),
                percent = round(nrow / sum(nrow) * 100, 2))

ggplot(df, aes(x = cut, y = clarity)) +
  geom_text(aes(size = percent, label = paste(percent, "%", sep = "")), 
                     show.legend = FALSE) +
  scale_size(range = c(2, 8))
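
If plyr is not installed, roughly the same summary table can be sketched in base R. The names counts and p2 are illustrative and not from the code above; Freq is the column name that as.data.frame() gives to the counts of a table.

```r
library(ggplot2)

# Cross-tabulate cut by clarity and flatten to one row per cell.
# The resulting columns are cut, clarity and Freq (the cell count).
counts <- as.data.frame(table(cut = diamonds$cut, clarity = diamonds$clarity))

# Drop empty cells so they are not labelled "0%", then compute percentages
counts <- counts[counts$Freq > 0, ]
counts$percent <- round(counts$Freq / sum(counts$Freq) * 100, 2)

# Text labels sized by percentage, as in the plyr version
p2 <- ggplot(counts, aes(x = cut, y = clarity)) +
  geom_text(aes(size = percent, label = paste0(percent, "%")),
            show.legend = FALSE) +
  scale_size(range = c(2, 8))

p2
```

Because each percentage is rounded separately, the values may not add up to exactly 100.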

Sandy Muspratt