This question is about converting the y-values in a plot to percentages, much like this question. However, the answers there no longer seem applicable, since surrounding variables with ".." has been replaced by the stat() function.
My plot looks like this:
It's created with the following code. The x variable is a categorical variable (city) and the y variable counts the number of observations for each city:
ggplot(fulldata, aes(x=fct_rev(fct_infreq(CITY_LADOK)))) +geom_bar() +coord_flip()
I want to transform the y values to percentages, preferably without having to create a reference table. The help page for calculated aesthetics is not particularly helpful: it doesn't state whether percentages can be calculated, nor how it's done. If I extrapolate from the examples, though, I should be able to write something along the lines of:
ggplot(fulldata, aes(x=fct_rev(fct_infreq(CITY_LADOK)))) +
  geom_bar(y=stat(count/sum(count))) +
  coord_flip()
In theory, at least. In practice I get this error message:
Error in sum(count) : invalid 'type' (closure) of argument
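A note on the message itself: "closure" is R's internal name for an ordinary function, so the error suggests that count was looked up as a function rather than as a computed variable. The small check below is only a sketch of that reading. It assumes dplyr is installed (likely here, since the data is a grouped_df) and that dplyr::count is the function being found; neither is confirmed.

```r
# Assumption: the `count` found in the failing call is dplyr's count() function.
# Summing a function reproduces the same kind of error message.
msg <- tryCatch(
  sum(dplyr::count),
  error = function(e) conditionMessage(e)
)

is.function(dplyr::count)  # a closure, not a numeric vector
print(msg)
```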
But what if I scale this down and simply use stat() to calculate the original plot?
ggplot(fulldata, aes(x=fct_rev(fct_infreq(CITY_LADOK)))) +
  geom_bar(y=stat(count)) +
  coord_flip()
This gives another error message:
Error in rep(value[[k]], length.out = n) :
attempt to replicate an object of type 'closure'
It doesn't work with y=stat(bin) and it doesn't seem to work with y=stat(identity) either. Can the stat()-function be used at all with categorical values and if so, can it be used to calculate percentages?
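For comparison, here is a minimal sketch of the variant where the computed variable is mapped inside aes() instead of being passed to geom_bar() as a plain argument. It assumes ggplot2 3.3.0 or later, where after_stat() is available; on 3.1 or 3.2 the same expression would be written with stat() inside aes(). The small data frame is a stand-in for fulldata, built from a few city names in the excerpt, and the scales::percent labels are an optional addition.

```r
library(ggplot2)
library(forcats)

# Stand-in for `fulldata` (hypothetical subset, city names taken from the excerpt)
fulldata <- data.frame(
  CITY_LADOK = c("UDDEVALLA", "ALE", "UDDEVALLA", "UPPSALA",
                 "UDDEVALLA", "KUNGSBACKA", "ALE", "UDDEVALLA")
)

# `count` is a variable computed by stat_count(); mapping it inside aes()
# lets ggplot2 evaluate count / sum(count) on the computed data.
p <- ggplot(fulldata, aes(x = fct_rev(fct_infreq(CITY_LADOK)))) +
  geom_bar(aes(y = after_stat(count / sum(count)))) +
  scale_y_continuous(labels = scales::percent) +
  coord_flip()

# Inspect the computed bar heights without rendering the plot
bars <- layer_data(p)
print(bars$y)
```

Calling print(p) draws the chart; the heights in bars$y are proportions of all observations, so they should sum to 1.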
Excerpt of data:
structure(list(start_date = structure(c(17776, 17776, 17776,
17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776,
17776, 17776, 17776, 17776, 17776, 17776, 17776, 17776), class = "Date"),
CITY_LADOK = c("GÖTEBORG", "LILLA_EDET", "GÖTEBORG", "GÖTEBORG",
"UDDEVALLA", "SKÖVDE", "VÄSTERÅS", "TROLLHÄTTAN", "ALE",
"GÖTEBORG", "GÖTEBORG", "GÖTEBORG", "UPPSALA", "TJÖRN", "TROLLHÄTTAN",
"UDDEVALLA", "UDDEVALLA", "KUNGSBACKA", "VÄNERSBORG", "UDDEVALLA"
)), row.names = c(NA, -20L), groups = structure(list(start_date = structure(17776, class = "Date"),
.rows = list(1:20)), row.names = c(NA, -1L), class = c("tbl_df",
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))