There are multiple questions (here for instance) on how to arrange the x-axis by frequency in a bar chart with ggplot2. However, my aim is to arrange the categories on the x-axis of a stacked bar chart by the relative frequency of a subset of the fill. For instance, I would like to sort the x-axis by the percentage of category B in variable z.
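To make the target concrete, here is a minimal base R sketch on a made-up data frame. The names toy, share_b and x_order are hypothetical and only serve as illustration; they are not part of my real data. The x-axis should follow the order stored in x_order.

```r
# Hypothetical toy data, only to illustrate the ordering I am after
toy <- data.frame(
  x = c("ABC", "ABC", "ABC", "CDA", "CDA", "XYZ", "XYZ", "XYZ", "XYZ"),
  z = c("A",   "B",   "B",   "A",   "A",   "A",   "B",   "A",   "A")
)

# Share of z == "B" within each level of x (named numeric vector)
share_b <- vapply(split(toy$z == "B", toy$x), mean, numeric(1))

# Desired left-to-right order of the x-axis: highest share of B first
x_order <- names(share_b)[order(share_b, decreasing = TRUE)]
x_order
# "ABC" "XYZ" "CDA"
```

Here ABC has a B share of 2/3, XYZ of 1/4 and CDA of 0, so ABC should come first and CDA last.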
This was my first try, using only ggplot2:
library(ggplot2)
library(tibble)
library(scales)

factor1 <- as.factor(c("ABC", "CDA", "XYZ", "YRO"))
factor2 <- as.factor(c("A", "B"))

# Reproducible sample: 1000 rows, x and z drawn independently
set.seed(43)
data <- tibble(x = sample(factor1, 1000, replace = TRUE),
               z = sample(factor2, 1000, replace = TRUE))

# Stacked bars scaled to 100 %; the x-axis is not sorted by the share of B
ggplot(data = data, aes(x = x, fill = z, order = z)) +
  geom_bar(position = "fill") +
  scale_y_continuous(labels = percent)
When that didn't work, I created a summarised data frame using dplyr, spread the data, sorted it by B, and gathered it again. But plotting that didn't work either.
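library(dplyr)
library(tidyr)

data %>%
  group_by(x, z) %>%
  count() %>%              # n = number of rows per x/z combination
  spread(z, n) %>%         # wide format: one column each for A and B
  arrange(-B) %>%          # sort rows by the count of B, descending
  gather(z, n, -x) %>%     # back to long format
  ggplot(aes(x = reorder(x, n), y = n, fill = z)) +
  geom_bar(stat = "identity", position = "fill") +
  scale_y_continuous(labels = percent)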
I would prefer a solution with ggplot2 only, so that the plot does not depend on the row order of the data frame created by dplyr/tidyr. However, I'm open to anything.