Apply a function that refers to a subset without a for loop

Question

I have a data.frame such as this one:

df=data.frame(id=c("A","A","A","B","B","B"), V=c(3,6,8,5,6,4))

I would like to divide each value of V by the sum of V within the same id and store the result in a new column. I can achieve this with a for loop:

for (i in 1:nrow(df)) {
  df$y[[i]] <- df$V[[i]]/sum(subset(df, id == df$id[[i]])$V)
}

Which gives the expected output:

  id V         y
1  A 3 0.1764706
2  A 6 0.3529412
3  A 8 0.4705882
4  B 5 0.3333333
5  B 6 0.4000000
6  B 4 0.2666667

I would like to know whether there is a simpler or more efficient way to do this, e.g. using the apply family. Thanks for your help!

`df <- transform(df, y = ave(V, id, FUN = prop.table))` – Ronak Shah, Apr 29 '21 at 06:51

score 0 · Answer 1 · answered Apr 29 '21 at 06:52

Probably the easiest way is using dplyr:

library(dplyr)
df=data.frame(id=c("A","A","A","B","B","B"), V=c(3,6,8,5,6,4))
df %>% 
  group_by(id) %>% 
  summarise(y = prop.table(V))
# A tibble: 6 x 2
# Groups:   id [2]
  id        y
  <chr> <dbl>
1 A     0.176
2 A     0.353
3 A     0.471
4 B     0.333
5 B     0.4  
6 B     0.267
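
Note that the summarise() call above returns only id and y, so V is dropped. If the aim is to keep V and add y as a new column, as the question asks, a minimal sketch with mutate() could look like the following. The name res is only for illustration, and ungroup() is added on the assumption that the grouping is not needed afterwards.

```r
library(dplyr)

df <- data.frame(id = c("A", "A", "A", "B", "B", "B"), V = c(3, 6, 8, 5, 6, 4))

# mutate() keeps all rows and columns; V / sum(V) is evaluated within each id group
res <- df %>%
  group_by(id) %>%
  mutate(y = V / sum(V)) %>%
  ungroup()

res
```

In recent dplyr versions (1.1.0 and later), returning several rows per group from summarise() is deprecated in favour of reframe(), which is another reason mutate() may be the safer choice here.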

score 0 · Accepted Answer · answered Apr 29 '21 at 07:02

There is no need for a loop or the apply family here, since R is already vectorised. Use the answer suggested by Ronak in the comments above. You may also use

ave(df$V, df$id, FUN = function(x) x/sum(x)) 

[1] 0.1764706 0.3529412 0.4705882 0.3333333 0.4000000 0.2666667

for a better understanding of what happens: the anonymous function does the same thing as prop.table here. You can also store the result in a new column.
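
As a rough sketch of that last step, assuming the new column should be called y as in the question, the result of ave() can be assigned straight back to the data frame:

```r
df <- data.frame(id = c("A", "A", "A", "B", "B", "B"), V = c(3, 6, 8, 5, 6, 4))

# ave() applies the function within each id group and returns a vector
# of the same length and order as df$V, so it can be assigned directly
df$y <- ave(df$V, df$id, FUN = function(x) x / sum(x))

# Within each id the shares should add up to 1
tapply(df$y, df$id, sum)
```

Because ave() preserves the original row order, no merge or reordering step is needed.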
