I am learning R through Wickham's book and there is something that I do not completely understand. Here is the code:
ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut, y = after_stat(prop)))
So what I think happens here is that geom_bar groups by x (= cut), i.e. creates the 'levels', and the built-in stat_count in geom_bar counts the number of elements in each level of x. To get the proportion, we have to use prop, and we access it with after_stat because it is computed by the built-in stat_count. However, after_stat(prop) takes the count that stat_count outputs for each level and divides it by itself (and NOT by the sum of the counts of ALL levels). As a result, we just get bars with height 1. So far so good.
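To check this, I looked at the data that the layer actually draws. This is only a sketch: I am assuming that layer_data() returns the data frame after stat_count has run, and that it contains the columns x, group, count and prop. If my reading is right, every level of cut ends up with its own group number and prop is 1 in every row.

```r
library(ggplot2)

p1 <- ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut, y = after_stat(prop)))

# Data frame for the first (and only) layer, after the stat has been computed
d1 <- layer_data(p1)

# One row per level of cut, with the group it was assigned to
d1[, c("x", "group", "count", "prop")]
```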
The apparent solution to the problem is this:
ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut, y = after_stat(prop), group = 1))
With this code, I get the correct heights, meaning that most likely the count of each level of x is divided by the sum of the counts over all levels, and not just by its own count.
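To make my mental model concrete, here is a small base R sketch of what I assume happens in the two cases. It is not ggplot2's actual code: prop_by_group() is a helper I made up, cut_toy is invented toy data, and the rule "prop = count / sum of counts within the same group" is my assumption about stat_count.

```r
# Invented toy data standing in for diamonds$cut
cut_toy <- factor(c("Fair", "Good", "Good", "Ideal", "Ideal", "Ideal", "Ideal"))

# Hypothetical helper: count rows per (x, group) pair, then divide each
# count by the total count of the group it belongs to
prop_by_group <- function(x, group) {
  counts <- as.data.frame(table(x = x, group = group), stringsAsFactors = FALSE)
  counts <- counts[counts$Freq > 0, ]  # drop combinations that never occur
  counts$prop <- counts$Freq / ave(counts$Freq, counts$group, FUN = sum)
  counts
}

# Case 1: every level of x is its own group (what I think the default does)
default_groups <- prop_by_group(cut_toy, group = cut_toy)

# Case 2: all rows share one group (what I think group = 1 does)
single_group <- prop_by_group(cut_toy, group = rep(1, length(cut_toy)))

default_groups
single_group
```

In case 1 each group contains a single level, so the count is divided by itself. In case 2 the denominator is the total number of rows. What I still do not know is whether this matches the order in which ggplot2 really does things.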
Now, I have seen this post. However, it doesn't explain what EXACTLY happens, in chronological order, once group = 1 is added.
What exactly does R do differently now, and in which step?
Thanks for explaining.