
I'm struggling with multiple-response questions in R. I'm hoping to find an easy way to tackle this with dplyr and tidyr. Below is a sample multiple-response data frame. I'm trying to do two things. First, create percentages (% of cats, % of dogs, etc.), where the percentages are of overall responses. My usual way of calculating percentages -

group_by(_)%>%summarise(count=n())%>%mutate(percent=count/sum(count)) 

doesn't seem to cut it in this situation. Maybe I have to use summarise_each or a more specialized function? I'm still new to R and really new to dplyr and tidyr. I also tried tidyr's "unite" function, which works, but it includes the NAs, which I would have to recode away. Even then, I can't seem to calculate the percentages of the united column.

Any suggestions would be great! First, how do I unite the multiple-response columns with "unite" into all possible combinations and then calculate the percentage of each? Second, how do I simply calculate the percentage of each binary column as a proportion of overall responses? Hope this makes sense! I'm sure there's a simple and elegant answer that I'm overlooking.

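library(dplyr)
library(tidyr)

# Each column holds the animal's name if it was selected, NA otherwise
Cats <- c("Cat", NA, "Cat", NA, NA, NA, "Cat", NA)
Dogs <- c(NA, NA, "Dog", "Dog", NA, "Dog", NA, "Dog")
Fish <- c(NA, NA, "Fish", NA, NA, NA, "Fish", "Fish")

Pets <- data.frame(Cats, Dogs, Fish)

# Combine the three columns into one; the NAs end up in the combined string
Pets <- Pets %>% unite(Combined, Cats, Dogs, Fish, sep = ",", remove = FALSE)

# Percentage of each combination
Pets %>% group_by(Combined) %>% summarise(count = n()) %>% mutate(percent = count / sum(count))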
Frank
Mike
  • please share expected output and data that can be read into R. – mtoto Mar 18 '16 at 16:06
  • Thanks for commenting. I'm still relatively new to this site, what do you mean by share expected output? – Mike Mar 18 '16 at 17:34
  • what the data will look like after your desired transformation you describe in your post. – mtoto Mar 18 '16 at 21:11
  • read [this](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – mtoto Mar 19 '16 at 12:24

1 Answer


Based on my understanding of your question, it sounds like you can do this with tidyr's 'gather()' function instead of 'unite()'.

library(dplyr)
library(tidyr)

Pets %>% 
  gather(animal, type, na.rm = TRUE) %>% 
  group_by(animal) %>% 
  summarize(count = n()) %>% 
  mutate(percentage = count / sum(count))
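
The pipeline above gives each animal's share of all selections made. The question also asks for the share of each combination and the share of respondents who picked each animal, so here is a rough sketch of both. It assumes a recent tidyr (1.0.0 or later, for the `na.rm` argument of `unite()`) and dplyr (1.0.0 or later, for `across()`), and it rebuilds the sample data with the values quoted as strings. The names `combo_pct` and `respondent_pct` are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Sample data from the question, with the values quoted as strings.
# stringsAsFactors = FALSE keeps the columns as character vectors.
Pets <- data.frame(
  Cats = c("Cat", NA, "Cat", NA, NA, NA, "Cat", NA),
  Dogs = c(NA, NA, "Dog", "Dog", NA, "Dog", NA, "Dog"),
  Fish = c(NA, NA, "Fish", NA, NA, NA, "Fish", "Fish"),
  stringsAsFactors = FALSE
)

# Share of each combination. unite() with na.rm = TRUE drops the NAs
# instead of pasting them into the combined string; rows with no
# selection at all end up as an empty string.
combo_pct <- Pets %>%
  unite(Combined, Cats, Dogs, Fish, sep = ",", na.rm = TRUE) %>%
  count(Combined, name = "count") %>%
  mutate(percent = count / sum(count))

# Share of respondents (rows) who picked each animal:
# the mean of a logical vector is the proportion of TRUEs.
respondent_pct <- Pets %>%
  summarise(across(everything(), ~ mean(!is.na(.x))))

combo_pct
respondent_pct
```

Note that the two denominators differ: `respondent_pct` divides by the number of rows (for example, Dogs is selected in 4 of the 8 rows), while the `gather()` approach divides by the total number of selections. Which one counts as "overall responses" depends on how you want to report the results.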
Kan Nishida