How to get dplyr `pull()` to recognize grouping in R?

Question

library(tidyverse)
df <- tibble(col1 = c("a", "a", "b", "b"),
             col2 = c(2, NA, 10, 8))
#> # A tibble: 4 x 2
#>   col1   col2
#>   <chr> <dbl>
#> 1 a         2
#> 2 a        NA
#> 3 b        10
#> 4 b         8

I've got the data frame above that I'd like to perform the following logic on:

1. Group by col1
2. Within each col1 group, determine the largest col2 value
3. Populate that largest col2 value as the col3 value for every row in the group

What you'd end up with is the data frame below.

#> # A tibble: 4 x 3
#>   col1   col2  col3
#>   <chr> <dbl> <dbl>
#> 1 a         2     2
#> 2 a        NA     2
#> 3 b        10    10
#> 4 b         8    10

My attempt is below. I understand it doesn't work because the way I've written dplyr::pull() doesn't pick up the grouping I intend. How do I get dplyr::pull() to recognize that grouping, or is there a better approach to this problem?

df %>% 
  group_by(col1) %>% 
  mutate(col3 = top_n(., 1, col2) %>% pull(col2))
#> # A tibble: 4 x 3
#> # Groups:   col1 [2]
#>   col1   col2  col3
#>   <chr> <dbl> <dbl>
#> 1 a         2     2
#> 2 a        NA    10
#> 3 b        10     2
#> 4 b         8    10

I think you're getting messed up by trying to use `dplyr` functions in places where base ones would suffice. `top_n` returns a data frame—if all you need is the 1 largest value of `col2`, why not just `max(col2)`? `pull` is the same as just `$` or `[[` and also not needed here — camille, Oct 28 '19 at 14:09
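
To make the diagnosis concrete, here is a small sketch of what the `.` in the attempt appears to evaluate to. It assumes the same `df` as in the question; the name `per_group_top` is introduced only for illustration. The `.` stands for the whole grouped data frame coming out of the pipe, not the current group's rows, so `top_n()` returns one row per group and `pull()` hands back both maxima at once.

```r
library(dplyr)

df <- tibble(col1 = c("a", "a", "b", "b"),
             col2 = c(2, NA, 10, 8))

# Reproduce what `top_n(., 1, col2) %>% pull(col2)` sees inside mutate():
# `.` is the full grouped data frame, so the result has one value per group.
per_group_top <- df %>%
  group_by(col1) %>%
  top_n(1, col2) %>%
  pull(col2)

per_group_top
#> [1]  2 10
```

Because each group here happens to have two rows, that two-element vector fits every group as-is, which would explain the 2, 10, 2, 10 pattern in col3 above.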

score 2 · Accepted Answer · answered Oct 28 '19 at 14:08

You're close. The function to use is `max()`, which returns the largest value in each group; with `na.rm = TRUE` it drops the NAs before taking the maximum.

df %>% 
  group_by(col1) %>% 
  mutate(col3 = max(col2, na.rm = TRUE))

# A tibble: 4 x 3
# Groups:   col1 [2]
#  col1   col2  col3
#  <chr> <dbl> <dbl>
#1 a         2     2
#2 a        NA     2
#3 b        10    10
#4 b         8    10
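
One edge case worth keeping in mind: if a group contains only NAs, `max(col2, na.rm = TRUE)` returns `-Inf` with a warning. The sketch below is one possible way to handle that, not part of the answer above. `safe_max` is a hypothetical helper (not a dplyr function), and `df2` is made-up data with an extra all-NA group "c".

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): return NA for an all-NA group
# instead of the -Inf that max(..., na.rm = TRUE) would give.
safe_max <- function(x) {
  if (all(is.na(x))) NA_real_ else max(x, na.rm = TRUE)
}

# Made-up data: same as the question's df plus an all-NA group "c".
df2 <- tibble(col1 = c("a", "a", "b", "b", "c"),
              col2 = c(2, NA, 10, 8, NA))

result <- df2 %>%
  group_by(col1) %>%
  mutate(col3 = safe_max(col2)) %>%
  ungroup()
```

Whether NA or `-Inf` is the right value for an empty group depends on what happens to col3 downstream, so treat this as an option rather than a fix.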
