1

Problem description

I've run a survey with a multiple select question, where the output is separated by commas in one column, and a grouping question (e.g. sex). Now I want to cross tabulate those 2 variables.

Sample data

My data comprises of 2 columns:

  1. A multiple select question, which the survey software outputs as one column with commas separating the selection
  2. A grouping variable, in this case male or female

dat <- data.frame(Multiple = c("A,B,C","B","A,C"), Sex = c("M","F","F"))

Desired output

I want to cross tabulate the multiple select options (without commas) with sex:

Multiple Sex Count
A        M   1
B        M   1
C        M   1
A        F   1
B        F   1
C        F   1

Attempted solution

This is a partial solution where I count the elements in the multiple select question only. My problem is that I don't know how to include the grouping variable sex into this function because I am using a regular expression to count the elements in the comma separated vector:

MSCount <- function(X){

# Function to count values in a comma separated vector

    Answers <- sort(
    unique(
      unlist(
        strsplit(
          as.character(X), ",")))) # Find the possible options from the data alone, e.g. "A", "B" etc.

  Answers <- Answers[-which(Answers == "")] # Drop blank answers

  CountAnswers <- numeric(0) # Initialise the count as an empty numeric list

  for(i in 1:length(Answers)){
    CountAnswers[i] <- sum(grepl(Answers[i],X)) 
  } # Loop round and count the rows with a match for the answer text

  SummaryAnswers <- data.frame(Answers,CountAnswers,PropAnswers = 100*CountAnswers/length(X[!is.na(X)]))
  return(SummaryAnswers)

}

1 Answers1

0

We can use separate_rows

library(tidyverse)
separate_rows(dat, Multiple) %>% 
                   mutate(Count = 1) %>%
                   arrange(Sex, Multiple) %>%
                   select(Multiple, Sex, Count)
akrun
  • 874,273
  • 37
  • 540
  • 662