How can this loop be adapted to run within each group of a data frame while iterating over the same columns? I haven't been able to reproduce it with dplyr.

for (i in 2:nrow(df)) {
  df$sd_x[i] <- 1 / ((1 / df$sd_x[i - 1]^2) + (1 / df$prior_sd[i]^2))^0.5
}

     group prior_sd   sd_x
 1     A    1.14     0.808 
 2     B    1.14     0.233 
 3     C    1.14     0.136 
 4     D    1.14     0.100 
 5     A    1.14     0.659 
 6     B    1.14     0.224 
 7     C    1.14     0.132 
 8     D    1.14     0.0994
 9     A    1.14     0.571 
10     B    1.14     0.212 

When using dplyr, I don't believe you can reference the column you are creating with a lag of itself. The loop above works, but it ignores the grouping variable (`group`), which is my dilemma.
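
The difference between a lag-based expression and the loop can be seen in a small base-R sketch. The vectors below are made up for illustration, and `c(NA, head(x, -1))` is used as a stand-in for `dplyr::lag(x)`:

```r
# Toy vectors (made up for illustration, not the real data)
sd_x     <- c(0.8, 0.2, 0.1)
prior_sd <- c(1.14, 1.14, 1.14)

# Vectorised version: every row sees the ORIGINAL previous value.
# c(NA, head(x, -1)) is a base-R stand-in for dplyr::lag(x).
prev       <- c(NA, head(sd_x, -1))
vectorised <- 1 / sqrt(1 / prev^2 + 1 / prior_sd^2)

# Loop version: every row sees the UPDATED previous value.
looped <- sd_x
for (i in 2:length(looped)) {
  looped[i] <- 1 / sqrt(1 / looped[i - 1]^2 + 1 / prior_sd[i]^2)
}

print(vectorised)
print(looped)
```

The two results agree on the second element, where the previous value has not yet been overwritten, and diverge from the third element onward.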

Thanks

  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. You can probably do this more easily with `dplyr` using `group_by` and `lag()`. – MrFlick Feb 25 '22 at 05:35
  • As another user noted, it would be helpful to see what dataset you are working with. You can use this video as a guide for sharing the `dput` of your data: https://youtu.be/3EID3P1oisg – Shawn Hemelstrand Feb 25 '22 at 18:53
  • I have added a dummy dataset. Essentially the loop works (see column `sd_x`), but I am unsure how to reference the grouping variable `group`. I don't believe `dplyr` will be useful, as `mutate` won't correctly calculate a variable that is a lag of itself. – Christos Manoussakis Feb 27 '22 at 00:20

1 Answer

To elaborate on MrFlick's suggestion of `lag`, what you want is probably something like this, but it's hard to know without a reproducible example.

df %>%
  group_by(group) %>%
  mutate(sd_x = 1 / ((1 / lag(sd_x)^2) + (1 / prior_sd^2))^0.5) %>%
  ungroup()
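
One caveat: `lag(sd_x)` reads the column as it stood before `mutate()` ran, so the first row of each group becomes `NA` and later rows do not build on the updated values the way the original loop does. If the recursive behaviour is what is wanted, a possible sketch is to wrap the loop in a function and call it once per group. The helper name `update_sd` and the toy data below are hypothetical, chosen only for illustration:

```r
library(dplyr)

# Hypothetical helper: applies the question's recurrence to one group's rows.
# sd_x[1] is kept as the seed; each later value uses the UPDATED previous one.
update_sd <- function(sd_x, prior_sd) {
  for (i in seq_along(sd_x)[-1]) {
    sd_x[i] <- 1 / sqrt(1 / sd_x[i - 1]^2 + 1 / prior_sd[i]^2)
  }
  sd_x
}

# Toy data (made up for illustration; groups are interleaved on purpose)
df <- data.frame(
  group    = rep(c("A", "B"), times = 3),
  prior_sd = 1.14,
  sd_x     = c(0.808, 0.233, 0.5, 0.5, 0.5, 0.5)
)

result <- df %>%
  group_by(group) %>%
  mutate(sd_x = update_sd(sd_x, prior_sd)) %>%  # runs once per group
  ungroup()

print(result)
```

Inside a grouped `mutate()`, the helper receives only the rows of the current group, so each group is seeded by its own first `sd_x` value and the original row order is preserved.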
Sethzard