I have a variable that takes the values a, b or c, recorded for different persons (A, B, C) and different years (1, 2, 3). It looks like this:

[image of the data table, not available]

I want to calculate the percentages within each coloured block. For example, the three yellow rows together make up 100%: A-2-a is then 13%, A-2-b is 48% and A-2-c is 39%. The three green rows together should add up to 100% as well.

I tried the following, but it does not work. Can anybody help me?

  group_by(Person, Year, Variable) %>%
  dplyr::mutate(Count = n()) %>%
  group_by(Person, Year) %>%
  dplyr::mutate(Percentage = formattable::percent(count / sum(count), digits=1)) %>%
  dplyr::select(Year, Variable, Count, Percentage) %>%
  distinct()
Yvonne
  • Note that in your first `mutate` you have an uppercase `Count`, but in the second `mutate` you have a lowercase `count`. Also, it'd be good to always include the error message if your code is not working, and of course, always include some sample data using `dput` instead of posting data in the form of an image. – benson23 May 24 '23 at 13:01
  • Does this answer your question? [Relative frequencies / proportions with dplyr](https://stackoverflow.com/questions/24576515/relative-frequencies-proportions-with-dplyr) – I_O May 24 '23 at 13:09
  • I am sorry, benson23. I changed the variable names and other details to protect the privacy of the data, so Count versus count was not the problem. The problem is the grouping over which R calculates the percentage: R does not correctly work out which rows belong together. – Yvonne May 24 '23 at 13:13
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Please [do not post code or data in images](https://meta.stackoverflow.com/q/285551/2372064) – MrFlick May 24 '23 at 13:15

1 Answer

library(tidyverse)

df <- tibble(
  person = rep("A", 9),
  year = rep(c(1, 2, 3), each = 3),
  variable = rep(c("a", "b", "c"), 3),
  count = c(23, 30, 35, 6, 22, 18, 24, 12, 36)
)

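df %>%
  # one group per person-year block (one "colour field")
  group_by(person, year) %>%
  # share of each row within its own group; each group sums to 1
  mutate(percentage = count / sum(count))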

# A tibble: 9 × 5
# Groups:   person, year [3]
  person  year variable count percentage
  <chr>  <dbl> <chr>    <dbl>      <dbl>
1 A          1 a           23      0.261
2 A          1 b           30      0.341
3 A          1 c           35      0.398
4 A          2 a            6      0.130
5 A          2 b           22      0.478
6 A          2 c           18      0.391
7 A          3 a           24      0.333
8 A          3 b           12      0.167
9 A          3 c           36      0.5  
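
The example above starts from counts that are already aggregated. The `n()` call in the question suggests the real data may instead have one row per observation. That is an assumption, since the original data were not posted. In that case `mutate(Count = n())` keeps every raw row, so the later `sum(Count)` adds each count once per row and the denominator is too large. A minimal sketch of one way around this collapses the data with `count()` first. The `raw` data frame and its values are hypothetical, chosen to reproduce the year-2 block:

```r
library(dplyr)

# Hypothetical raw data: one row per observation (assumption, not the asker's data)
raw <- data.frame(
  Person   = "A",
  Year     = 2,
  Variable = rep(c("a", "b", "c"), times = c(6, 22, 18))
)

result <- raw %>%
  # collapse to one row per Person-Year-Variable combination
  count(Person, Year, Variable, name = "Count") %>%
  # percentages are computed within each Person-Year block
  group_by(Person, Year) %>%
  mutate(Percentage = Count / sum(Count)) %>%
  ungroup()

result
```

Because each combination now appears only once, `sum(Count)` is the true block total and no `distinct()` step is needed afterwards.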
Chamkrai