I have a data frame in the following format:
Person  Answer  Value
John    Yes     3
Pete    No      6
Joan    Yes     5
Joan    Yes     4
Pete    No      7
I want to conduct an analysis (and create a stacked bar plot) where I group by the Person variable (whose values repeat) and the Answer variable, and then summarize Value.
I've tried doing this with dplyr, but I'm running into issues. Once I add a group_by step to the pipe, the values I want to use in the calculation are limited to each group.
e.g.,
library(dplyr)
df2 <- df %>%
  select(Person, Answer, Value) %>%
  group_by(Person, Answer) %>%
  # nrow(df) is the row count of the original, ungrouped data frame; "result" is a placeholder name
  summarise(result = sum(Value == 3) / nrow(df) + sum(Value == 6) / nrow(df),
            .groups = "drop")
The problem is getting this calculation to work properly. It doesn't make sense AFTER the data has been grouped, because each group contains only a few rows and I end up with a very limited data frame.
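For illustration, here is a minimal sketch of one way the calculation could work. It assumes the intended statistic is each group's sum of Value divided by the total of the ungrouped data frame. That interpretation, and the names total_value, group_sum and share_of_total, are assumptions rather than anything stated above. The idea is to compute the ungrouped total before the pipe, so that group_by cannot affect it.

```r
library(dplyr)

# Sample data copied from the table at the top of the question
df <- data.frame(
  Person = c("John", "Pete", "Joan", "Joan", "Pete"),
  Answer = c("Yes", "No", "Yes", "Yes", "No"),
  Value  = c(3, 6, 5, 4, 7)
)

# Computed on the ungrouped data, so grouping later does not change it
total_value <- sum(df$Value)

df2 <- df %>%
  group_by(Person, Answer) %>%
  summarise(
    group_sum      = sum(Value),                # per Person/Answer sum
    share_of_total = sum(Value) / total_value,  # relative to the whole data frame
    .groups = "drop"
  )

print(df2)
```

Only Person/Answer combinations that actually occur in the data appear in df2, so rows such as John/No would still be missing from this result.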
Expected output:
Person  Answer  Value
Joan    Yes     calculated value (summary stat)
Joan    No      calculated value
John    Yes     calculated value ...
John    No
Pete    Yes
Pete    No
Ultimately, I'd like to make a stacked bar chart where the summary is shown per person and each bar is divided into percentages by "yes" and "no" answers. For example, there would be 3 bars: one for John, one for Pete, and one for Joan, with each bar divided into two parts (values based on the yes/no response).
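A rough sketch of the kind of chart described, assuming ggplot2 is acceptable (the question does not name a plotting package) and that the percentages are each answer's share of Value within a person:

```r
library(ggplot2)

# Sample data copied from the table at the top of the question
df <- data.frame(
  Person = c("John", "Pete", "Joan", "Joan", "Pete"),
  Answer = c("Yes", "No", "Yes", "Yes", "No"),
  Value  = c(3, 6, 5, 4, 7)
)

# position = "fill" stacks the Value totals per Person and rescales each bar to 1,
# so the segments show each Answer's share within that person
p <- ggplot(df, aes(x = Person, y = Value, fill = Answer)) +
  geom_col(position = "fill") +
  scale_y_continuous(labels = function(x) paste0(100 * x, "%")) +
  labs(y = "Share of Value")

print(p)
```

With the five sample rows, each person gave only one kind of answer, so every bar would be a single color. The two-part split only appears once a person has both "Yes" and "No" rows.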
Thanks!