I know the answer to this question will be simple but I have searched the forums extensively and I have been unable to find a solution.
I have a column called Data_source, which is a factor that I want to group my variables by. I have a series of symptom* variables for which I want the counts according to Data_source.
For some reason, I am unable to figure out how to do this. The normal group_by functions do not seem to work appropriately.
Here is the data frame in question:
df <- wrapr::build_frame(
"Data_source" , "Sex" , "symptoms_decLOC", "symptoms_nausea_vomitting" |
"1" , "Female", NA_character_ , NA_character_ |
"1" , "Female", NA_character_ , NA_character_ |
"1" , "Female", "No" , NA_character_ |
"1" , "Female", "Yes" , "No" |
"1" , "Female", "Yes" , "No" |
"1" , "Female", "Yes" , "No" |
"1" , "Male" , "Yes" , "No" |
"1" , "Female", "Yes" , "No" |
"2" , "Female", NA_character_ , NA_character_ |
"2" , "Male" , NA_character_ , NA_character_ |
"2" , "Male" , NA_character_ , NA_character_ |
"2" , "Female", "Yes" , "No" |
"2" , "Female", "Yes" , "No" |
"2" , "Male" , NA_character_ , NA_character_ |
"2" , "Male" , NA_character_ , NA_character_ |
"2" , "Male" , NA_character_ , NA_character_ |
"2" , "Female", NA_character_ , NA_character_ |
"2" , "Female", NA_character_ , NA_character_ |
"2" , "Male" , NA_character_ , NA_character_ |
"2" , "Female", NA_character_ , NA_character_ )
Notice that Sex and the symptoms variables are all factors which include NAs. I have attempted the following:
df %>% na.omit() %>% group_by(Data_source) %>% count("symptoms_decLOC")
This does not work, and it is less than optimal because I would have to repeat it for every column. Ideally I would use something similar to lapply(df, count), but that does not give me a breakdown for each group.
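One way to avoid repeating the call for every column might be to reshape the symptom columns into long form and count once. The following is only a minimal sketch under some assumptions: it uses tidyr::pivot_longer alongside dplyr, a small stand-in data frame (df_small, hypothetical) with the same column names as above, and illustrative output column names (symptom, response) that are not part of the original data.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for the data above: same column names, fewer rows
df_small <- data.frame(
  Data_source = c("1", "1", "1", "2", "2"),
  symptoms_decLOC = c("Yes", "No", NA, "Yes", NA),
  symptoms_nausea_vomitting = c("No", NA, NA, "No", NA)
)

counts <- df_small %>%
  # Stack every symptoms_* column into name/value pairs
  pivot_longer(
    starts_with("symptoms_"),
    names_to = "symptom",    # illustrative column name
    values_to = "response"   # illustrative column name
  ) %>%
  # Drop missing answers per value rather than per row
  filter(!is.na(response)) %>%
  # One row per Data_source / symptom / answer combination
  count(Data_source, symptom, response)

print(counts)
```

Filtering on the stacked response column, rather than calling na.omit() on the whole data frame, keeps rows where only some of the symptom columns are missing.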
EDIT
In response to the question below, I have added the expected output. I edited this in Excel, color coding the group_by groups for clarity.
Notice how I am getting a breakdown for each possible answer. When I run this using dplyr, here is the output.
> df %>% na.omit() %>% group_by(Data_source) %>% count("symptoms_decLOC")
# A tibble: 2 x 3
# Groups: Data_source [2]
Data_source `"symptoms_decLOC"` n
<chr> <chr> <int>
1 1 symptoms_decLOC 5
2 2 symptoms_decLOC 2
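A likely explanation for this output is that count() receives the quoted string "symptoms_decLOC", which dplyr treats as a constant value rather than a column reference, so every row falls into the same category within each group. Passing the bare column name appears to give the per-answer breakdown. This is a minimal sketch on a hypothetical stand-in data frame (df_small), not the full data above.

```r
library(dplyr)

# Hypothetical stand-in data with the same column names as the question
df_small <- data.frame(
  Data_source = c("1", "1", "1", "1", "2", "2", "2"),
  symptoms_decLOC = c("Yes", "Yes", "No", NA, "Yes", "Yes", NA)
)

by_value <- df_small %>%
  # Remove missing answers for this column only
  filter(!is.na(symptoms_decLOC)) %>%
  # Bare (unquoted) column name: counts each distinct answer per Data_source
  count(Data_source, symptoms_decLOC)

print(by_value)
```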