Problem description
I've run a survey with a multiple select question, where the output is separated by commas in one column, and a grouping question (e.g. sex). Now I want to cross tabulate those 2 variables.
Sample data
My data comprises of 2 columns:
- A multiple select question, which the survey software outputs as one column with commas separating the selection
- A grouping variable, in this case male or female
dat <- data.frame(Multiple = c("A,B,C","B","A,C"), Sex = c("M","F","F"))
Desired output
I want to cross tabulate the multiple select options (without commas) with sex:
Multiple Sex Count
A M 1
B M 1
C M 1
A F 1
B F 1
C F 1
Attempted solution
This is a partial solution where I count the elements in the multiple select question only. My problem is that I don't know how to include the grouping variable sex into this function because I am using a regular expression to count the elements in the comma separated vector:
MSCount <- function(X){
# Function to count values in a comma separated vector
Answers <- sort(
unique(
unlist(
strsplit(
as.character(X), ",")))) # Find the possible options from the data alone, e.g. "A", "B" etc.
Answers <- Answers[-which(Answers == "")] # Drop blank answers
CountAnswers <- numeric(0) # Initialise the count as an empty numeric list
for(i in 1:length(Answers)){
CountAnswers[i] <- sum(grepl(Answers[i],X))
} # Loop round and count the rows with a match for the answer text
SummaryAnswers <- data.frame(Answers,CountAnswers,PropAnswers = 100*CountAnswers/length(X[!is.na(X)]))
return(SummaryAnswers)
}