0

I have the following table in R :

id var1 var2 value
ID1 A X 1
ID2 B X 2
ID3 C X 3
ID4 D X 4
ID5 A Y 2
ID6 C Y 5
ID7 B Y 3

And to group_by in dplyr the var1 and var2 and take the proportion of each grouping resulting to :

id var1 var2 value
ID1 A X 1/3
ID2 A Y 2/3
ID3 C X 3/8
ID4 C Y 5/8
ID5 B X 2/5
ID6 B Y 3/5
ID7 D X 1

I tried :

id = c("ID1","ID2","ID3","ID4","ID5","ID6","ID7")
var1 = c("A","B","C","D","A","C","B")
var2 = c(rep("X",4),rep("Y",3))
value = c(1,2,3,4,2,5,3)
data = data.frame(id,var1,var2,value);data
library(dplyr)
data%>%
  group_by(var1,var2)%>%
  summarise(prop = sum(value))

But it only group the var1 and var2. Any help ?

Homer Jay Simpson
  • 1,043
  • 6
  • 19
  • 1
    From the output it looks like you only want to group by `var1`. Try `data %>% group_by(var1) %>% mutate(value = prop.table(value))` – Ronak Shah Sep 14 '21 at 06:33

2 Answers2

1

This may works

library(dplyr)

data %>%
  group_by(var1)%>%
  mutate(value = value/sum(value)) %>%
  arrange(var1, var2)

  id    var1  var2  value
  <chr> <chr> <chr> <dbl>
1 ID1   A     X     0.333
2 ID5   A     Y     0.667
3 ID2   B     X     0.4  
4 ID7   B     Y     0.6  
5 ID3   C     X     0.375
6 ID6   C     Y     0.625
7 ID4   D     X     1 
Park
  • 14,771
  • 6
  • 10
  • 29
0

data.table

library(data.table)
setDT(df)[, res := proportions(value), by = var1][order(var1)]

base

df$res <- ave(df$value, list(df$var1), FUN = proportions)

Yuriy Saraykin
  • 8,390
  • 1
  • 7
  • 14