I am working with huge dataset that has thousands of rows and multiple columns. I want to use Dplyr to find maximum values of a group in a column and mutate the result in another column in the same table. Here is an example of my data:
df <- tibble(a = rep(letters[1:3], each = 3), b = seq(0.1,0.9, length.out = 9))
I want to find maximum value in each group of column a
. I know I can present maximum value in each group in a table using this code:
df %>% group_by(a) %>% summarise_all(list(~ max(.)))
which produces:
# A tibble: 3 x 2
a b
<chr> <dbl>
1 a 0.2
2 b 0.5
3 c 0.8
But that is not what I want. I desire a table that keeps the raw data, but with an additional column c
such that each value in column b
will have corresponding value in column c
that shows maximum value in each group that each data point in column b
belongs. The desired result that I want is below:
# A tibble: 9 x 3
a b c
<chr> <dbl> <dbl>
1 a 0.1 0.3
2 a 0.2 0.3
3 a 0.3 0.3
4 b 0.4 0.6
5 b 0.5 0.6
6 b 0.6 0.6
7 c 0.7 0.9
8 c 0.8 0.9
9 c 0.9 0.9
I will be glad for help. Thanks.