
I am using purrr and ggplot2 to create multiple plots at once. For each facet's label, I want to keep the group's name, but I also want to add the number of participants in each sub-group, for instance "Manager (N = 200)" and "Employee (N = 3000)". However, when I try to add this labeller argument:

    facet_grid(~.data[[group]],
               labeller = paste0(~.data[[group]], "(N = ", group_n$n, ")"))

I get this error:

Error in cbind(labels = list(), list(`{`, if (!is.null(.rows) || !is.null(.cols)) { : 
  number of rows of matrices must match (see arg 2)

Below is a reproducible example with a simplified dataset. My goal is to show each sub-group's name and sample size in its facet title.

library(purrr)
library(dplyr)
library(ggplot2)
library(tidyr) # needed for expand_grid() and drop_na()

#Data
test <- tibble(s1 = c("Agree", "Neutral", "Strongly disagree"),
               s2rl = c("Agree", "Neutral", NA),
               f1 = c("Strongly agree", NA, "Strongly disagree"),
               f2rl = c(NA, "Disagree", "Strongly disagree"),
               level = c("Manager", "Employee", "Employee"),
               location = c("USA", "USA", "AUS"))

#Get just test items for name
test_items <- test %>%
  dplyr::select(s1, s2rl, f1, f2rl)

#titles of plots for R to iterate over
titles <- c("S1 results", "Results for S2RL", "Fiscal Results for F1", "Financial Status of F2RL")


#group levels
group_name <- c("level", "location")

#Custom function to make plots

facet_plots = function(variable, group, title) {
  total_n <- test %>%
    summarize(n = sum(!is.na(.data[[variable]])))
  
  
  group_n <- test %>%
    group_by(.data[[group]], .data[[variable]]) %>%
    summarize(n = sum(!is.na(.data[[variable]])))
  
  
  plot2 <- test %>%
    count(.data[[group]], .data[[variable]]) %>%
    mutate(percent = 100*(n / group_n$n)) %>%
    drop_na() %>%
    ggplot(aes(x = .data[[variable]], y = percent, fill = .data[[variable]])) + 
    geom_bar(stat = "identity") +
    geom_text(aes(label= paste0(percent, "%"), fontface = "bold", family = "Arial", size=14), vjust= 0, hjust = -.5) +
    ylab("\nPercentage") +
    labs(
      title = title,
      subtitle = paste0("(N = ", total_n$n)) +
    coord_flip() +
    theme_minimal() +
    ylim(0, 100) +
    facet_grid(~.data[[group]],
               labeller = paste0(~.data[[group]], "(N = ", group_n$n, ")")) #issue is likely here
  
  output <- list(plot2)
  return(output)
}


#pmap call
my_plots <- expand_grid(tibble(item = names(test_items), title=titles),
                        group = group_name) %>%
  pmap(function(item, group, title)
    facet_plots(item, group, title))

my_plots

Edit: I've also tried the solution detailed here, and I receive the same error.

J.Sabree

1 Answer


The following function plots, within each group, the percentage of observations at each value of variable, and labels each facet with the group name and its count.

library(tidyr)
library(dplyr)
library(ggplot2)
library(purrr)
facet_plots <- function(variable, group, title="Title", dat) {
    
    variable <- sym(variable)
    group <- sym(group)
    sumdat <- dat %>%
        filter(!is.na(!!variable)) %>%
        group_by(!!group) %>%
        add_count() %>%
        mutate(lbl = paste0(!!group, " (N = ", n, ")")) %>%
        group_by(!!group, !!variable) %>%
        mutate(pct = 100 * n() / n) %>%
        slice(1L) %>%
        ungroup() %>%
        select(!!variable, !!group, n, pct, lbl)

    ggplot(sumdat, aes(x = !!variable, y = pct, group = !!group)) +
        geom_bar(stat = "identity") +
        labs(
            title = title
        ) +
        facet_grid(~lbl)

}

## Using starwars data
expand_grid(
    tibble(
        variable = c("hair_color", "skin_color", "birth_year"),
        title = c("Hair color", "Skin color", "Birth year")
    ),
    group = c("sex", "gender")) %>%
    mutate(title = paste(title, "by", group)) %>%
    pmap(facet_plots, dat = starwars)

expand_grid() creates a data frame of combinations stored as strings, and pmap() passes each row of it to facet_plots(). Therefore, the parameters to the facet_plots() function will be strings. The first two lines of the function turn the strings variable and group into symbols that R can use without quotes (read more here for what this means). The "bang-bang operator" !! tells R that you want the value stored in the variable, not the name itself (see help("!!")). Whenever R sees !!variable, it substitutes the column name stored in the parameter variable, so the expression refers to that column of the data frame.
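As a minimal sketch of what sym() and !! do together, here is a small helper that is not part of the answer's function (the name count_non_missing is made up for illustration). It assumes only that dplyr is loaded, which re-exports sym() from rlang.

```r
library(dplyr)

# Hypothetical helper, for illustration only: count the non-missing values
# of a column whose name arrives as a string.
count_non_missing <- function(dat, variable) {
    variable <- sym(variable)              # the string "mass" becomes the symbol mass
    dat %>%
        filter(!is.na(!!variable)) %>%     # evaluated as filter(!is.na(mass))
        nrow()
}

# Passing the column name as a string...
count_non_missing(starwars, "mass")

# ...gives the same count as typing the column name directly
nrow(filter(starwars, !is.na(mass)))
```

Without the sym() and !! steps, filter() would test whether the string "mass" itself is missing, which it never is, so no rows would be dropped.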

Below, I show that facet_plots() works with the OP's original data, not just the starwars example data.

## Using OP's data
test <- tibble(s1 = c("Agree", "Neutral", "Strongly disagree"),
               s2rl = c("Agree", "Neutral", NA),
               f1 = c("Strongly agree", NA, "Strongly disagree"),
               f2rl = c(NA, "Disagree", "Strongly disagree"),
               level = c("Manager", "Employee", "Employee"),
               location = c("USA", "USA", "AUS"))

expand_grid(
    tibble(
        variable = c("s1", "s2rl", "f1", "f2rl"),
        title = c("S1 results", "Results for S2RL", 
                  "Fiscal Results for F1", "Financial Status of F2RL")
    ),
    group = c("level", "location")
) %>%
    mutate(title = paste(title, "by", group)) %>%
    pmap(facet_plots, dat = test)

I believe the reason your labeller didn't work is that you were passing it the wrong type. The labeller() function takes parameters in the form of var = fxn, where var is a variable name in the facet grid and fxn is a function for how to transform the name. You passed it the result of paste0(), which combines a formula with a separate vector and returns a character vector rather than a function.
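If you would rather keep the original facet variable and only change its labels, one option is to hand facet_grid() a real labeller. The sketch below is an alternative to the lbl column used in facet_plots(), not what that function does. It uses a small made-up dataset, and the names dat, group_n, and facet_labels are illustrative. It relies on as_labeller(), which accepts a named character vector and uses it as a lookup table from facet value to displayed label.

```r
library(dplyr)
library(ggplot2)

# Small made-up dataset, for illustration only
dat <- tibble(
    level = c("Manager", "Employee", "Employee"),
    s1    = c("Agree", "Neutral", "Agree")
)

# One row per facet value, with its sample size in column n
group_n <- count(dat, level)

# Named character vector: names are the original facet values,
# values are the labels to display
facet_labels <- setNames(
    paste0(group_n$level, " (N = ", group_n$n, ")"),
    group_n$level
)

# as_labeller() turns the lookup vector into a labeller function
p <- ggplot(dat, aes(x = s1)) +
    geom_bar() +
    facet_grid(cols = vars(level), labeller = as_labeller(facet_labels))
p
```

Inside a function that receives the grouping column as a string, the same lookup vector would have to be built from that column before the ggplot() call.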

mikebader
  • thanks for responding! What pmap is doing is iterating over all combinations of all my variables by group (e.g., S1 by location, S1 by employee, S2rl by location, etc...), or in the starwars dataset, it'd be the equivalent of getting a plot for haircolor by sex, eyecolor by sex, etc. Something with the iteration is confusing R, and I run into the same issue with yours when I don't do just one specific variable/group. – J.Sabree Aug 26 '21 at 21:18
  • Ah, I see. I think that I fixed it so that you will be able to use `pmap` and included an example. I used the `starwars` data because I was having a hard time understanding exactly what you wanted from your data. But, I think that I changed it so that you can use whatever data you prefer via the `dat` parameter of `facet_plots`. – mikebader Aug 26 '21 at 22:03
  • thank you that does work! It says I can't award the bounty for another 18 hours, but I will then. A few questions if you don't mind: what does the {{ }} and !! mean? Also, can you explain what slice(1L) and sym are doing here? Thank you again for your help! – J.Sabree Aug 26 '21 at 22:23
  • I expanded the explanation to try to provide a better context to help make sense of why it works. The embrace operator `{{}}` is explained pretty well here: https://stackoverflow.com/a/62792494/12586249. The bang-bang operator `!!` does the same thing, so I changed the answer to use that throughout. The `slice(1L)` function takes only the first observation of each variable-group combo (the `1L` is a way to represent the *integer* 1) since the same `n` and `pct` values repeat for all observations in the variable-group combo. We only need one, so we take the first. Hope that helps! – mikebader Aug 27 '21 at 12:31