
I want to perform an operation on one column, grouped by the values of another column.

Say I have the following data:

user <- c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3)
score <- c(1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1)
time_1 <- c(130, NA, 120, 245, NA, NA, NA, 841, NA, NA, 721, 612)
time_2 <- c(NA, 742, NA, NA, 812, 212, 214, NA, 919, 528, NA, NA)
df <- data.frame(user, score, time_1, time_2) 

We get the following df:

   user score time_1 time_2
    1     1    130     NA
    1     0     NA    742
    1     1    120     NA
    1     1    245     NA
    2     0     NA    812
    2     0     NA    212
    2     0     NA    214
    2     1    841     NA
    3     0     NA    919
    3     0     NA    528
    3     1    721     NA
    3     1    612     NA

For each user, what is the smallest value of time_1? In other words, I want to group the rows by user and perform an operation on the time_1 column within each group.

juanjedi

1 Answer


Update at the OP's request (see comments): to keep every row of the original df and add the result as a new column, just replace summarise with mutate:

library(dplyr)
df %>%
  group_by(user) %>%
  mutate(Smallest_time1 = min(time_1, na.rm = TRUE))

    user score time_1 time_2 Smallest_time1
   <dbl> <dbl>  <dbl>  <dbl>          <dbl>
 1     1     1    130     NA            120
 2     1     0     NA    742            120
 3     1     1    120     NA            120
 4     1     1    245     NA            120
 5     2     0     NA    812            841
 6     2     0     NA    212            841
 7     2     0     NA    214            841
 8     2     1    841     NA            841
 9     3     0     NA    919            612
10     3     0     NA    528            612
11     3     1    721     NA            612
12     3     1    612     NA            612

We could use min() inside summarise with na.rm=TRUE argument:

library(dplyr)
df %>%
  group_by(user) %>%
  summarise(Smallest_time1 = min(time_1, na.rm = TRUE))

   user Smallest_time1
  <dbl>          <dbl>
1     1            120
2     2            841
3     3            612
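
To illustrate why the na.rm argument matters here, the following is a minimal base R sketch using the time_1 values of user 1 from the question. The variable names x, all_na and res are only for illustration. It also shows one caveat worth keeping in mind: when a group contains nothing but NA, min() with na.rm = TRUE returns Inf and emits a warning.

```r
# time_1 values for user 1, taken from the question's data
x <- c(130, NA, 120, 245)

min(x)                # NA: a single missing value makes the result missing
min(x, na.rm = TRUE)  # 120: NAs are dropped before taking the minimum

# Caveat: if every value is NA, nothing is left after dropping them,
# so min() returns Inf and warns (the warning is suppressed here)
all_na <- c(NA_real_, NA_real_)
res <- suppressWarnings(min(all_na, na.rm = TRUE))
res                   # Inf
```

In the question's data every user has at least one non-missing time_1, so this caveat does not arise there.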
TarJae
  • That works great. Is it possible to assign the values of `Smallest_time1` to the user in the original df in a new column? – juanjedi Nov 07 '21 at 09:03
  • Please see my update! – TarJae Nov 07 '21 at 09:05
  • Similarly, is it possible to compute this only for rows where, say, `score == 1`? Edit: I used `filter()` and it worked well. Thanks for the help! – juanjedi Nov 07 '21 at 09:17
  • You could use `ifelse` like: `df %>% group_by(user) %>% mutate(Smallest_time1 = ifelse(score==1, min(time_1, na.rm=TRUE), time_1))` – TarJae Nov 07 '21 at 09:22
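
For reference, the two ideas from the comments could be sketched as follows. This is only an illustration under some assumptions: the names by_filter and by_subset are made up here, and Option B subsets time_1 inside min(), which is a slight variation on the ifelse suggestion rather than the exact code from the comment.

```r
library(dplyr)

# Data from the question (time_2 is left out because it is not needed here)
df <- data.frame(
  user   = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3),
  score  = c(1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1),
  time_1 = c(130, NA, 120, 245, NA, NA, NA, 841, NA, NA, 721, 612)
)

# Option A: drop the unwanted rows first, then summarise (one row per user)
by_filter <- df %>%
  filter(score == 1) %>%
  group_by(user) %>%
  summarise(Smallest_time1 = min(time_1, na.rm = TRUE))

# Option B: keep all rows, but only let rows with score == 1 feed into min()
by_subset <- df %>%
  group_by(user) %>%
  mutate(Smallest_time1 = min(time_1[score == 1], na.rm = TRUE)) %>%
  ungroup()
```

With this particular data both options give 120, 841 and 612 for users 1, 2 and 3, the same as the unconditional minimum, because time_1 is only non-missing where score is 1. They differ for a user with no rows where score is 1: Option A drops that user from the result, while Option B keeps the rows and fills them with Inf (with a warning).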