I would like to display the counts of a survey in a barplot, binned by country and stacked by gender.

So far, I have been able to convert the answers into a table using the table() function and plot the counts binned by country and sorted by frequency. However, I am unable to stack the counts by gender while keeping the bars sorted by the number of observations per country.

I have not managed to create an MWE, so instead I will post the table I have so far:

           A    B    C      D     E
  Female   35   7    30     9    11
  Male     30   6     9     7     3
  Other     0   0     1     1     0

When I input this table into a barplot function, it won't sort the bar plot according to observations in each country (columns). When I use the sort function, it converts the table into a vector. The output I am hoping for looks as follows:

           A    C    D     E     B
  Female   35   30    9    11     7
  Male     30    9    7     3     6
  Other     0    1    1     0     0

So that ultimately the bar plot is ordered by the sum of the country counts and then by gender.

Other things I have tried so far: converting the table to a matrix and then following this tutorial on how to sort matrices. Sorting the table this way also converts it to a vector.

divibisan
Tea Tree
  • Or if you can use `ggplot2` (the better solution by far): https://stackoverflow.com/questions/5208679/order-bars-in-ggplot2-bar-graph – divibisan Feb 15 '19 at 21:12
  • I guess I should have added that I would like to do this using the base package. – Tea Tree Feb 15 '19 at 23:55
  • Oh, darn it. You are right, the above link by G. Grothendieck, divibisan, and Rui Barradas does explain it pretty well. I honestly googled this for a long time. Why didn't I come across this? – Tea Tree Feb 16 '19 at 00:01

1 Answer

It's a little difficult to tell exactly what you want, but my understanding is that you want a barplot stacked by gender, with the bars ordered by their total height (i.e. the number of survey participants from each country). If that is correct, here is a possible solution:

library(ggplot2)
library(dplyr)

# Fake survey data (sample() is random, so the exact counts vary between runs)
df <- data.frame(
  country = c(rep("US", 50), rep("UK", 20), rep("CHN", 30)),
  gender = sample(x = c("Female", "Male", "Other"),
                  prob = c(0.49, 0.49, 0.02),
                  size = 100, replace = TRUE)
)
table(df$gender, df$country)
##          CHN UK US
##   Female  12  8 25
##   Male    17 10 25
##   Other    1  2  0

df %>%
  # Count the number of survey participants per country, per gender
  count(country, gender, sort = TRUE) %>%
  # Reorder the country factor levels by the total number of participants per country.
  # The x-axis order follows the factor levels, which are alphabetical by default.
  # .fun = sum is needed because fct_reorder() summarises with the median by default.
  mutate(country = forcats::fct_reorder(.f = country, .x = n, .fun = sum, .desc = TRUE)) %>%
  # Create the stacked barplot
  ggplot(aes(x = country, y = n, fill = gender)) +
  geom_col()

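Since the comments ask for base graphics, here is a minimal base R sketch of the same idea. It assumes the counts are held in a two-way table like the one in the question; the object names `tab`, `totals` and `tab_sorted` are made up for illustration, and the table is rebuilt by hand because the original survey data is not available. The trick is to index the columns with order() instead of calling sort() on the table, so the result keeps both dimensions.

```r
# Counts from the question, rebuilt by hand (rows = gender, columns = country).
# matrix() fills column by column, so each line below is one country.
tab <- as.table(matrix(
  c(35, 30, 0,   # A
     7,  6, 0,   # B
    30,  9, 1,   # C
     9,  7, 1,   # D
    11,  3, 0),  # E
  nrow = 3,
  dimnames = list(gender  = c("Female", "Male", "Other"),
                  country = c("A", "B", "C", "D", "E"))
))

# Column totals give the number of respondents per country
totals <- colSums(tab)

# Reorder the columns by their totals, largest first.
# Column indexing keeps the result two-dimensional, unlike sort().
tab_sorted <- tab[, order(totals, decreasing = TRUE)]
colnames(tab_sorted)
## [1] "A" "C" "D" "E" "B"

# Stacked bars, one bar per country, tallest first
barplot(tab_sorted, legend.text = TRUE,
        xlab = "Country", ylab = "Respondents")
```

If the stacking order within each bar should also follow the gender totals, the rows can presumably be reordered the same way with order(rowSums(tab), decreasing = TRUE) as the row index.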

ladylala