I am trying to use a conditional statement inside a pipe, but it fails.

The data looks like this:

group = rep(letters[1:3], each = 3)
status = c(T,T,T,  T,T,F,  F,F,F)
value  = c(1:9)

df = data.frame(group = group, status = status, value = value)

> df
  group status value
1     a   TRUE     1
2     a   TRUE     2
3     a   TRUE     3
4     b   TRUE     4
5     b   TRUE     5
6     b  FALSE     6
7     c  FALSE     7
8     c  FALSE     8
9     c  FALSE     9

For each group, I want the row with the maximum value, subject to one condition: if any status in the group is TRUE, then filter(status == T) %>% slice_max(value); otherwise just slice_max(value).

What I have tried is this:

# way 1
df %>% 
  group_by(group) %>% 
  if(any(status) == T) {
    filter(status == T) %>% slice_max(value)
  } else {
    slice_max(value)
  }

# way 2 
df %>% 
  group_by(group) %>% 
  when(any(status) == T,
    filter(status == T) %>% slice_max(value),
    slice_max(value))

The output I expect looks like this:

> expected_df
  group status value
1     a   TRUE     3
2     b   TRUE     5
3     c  FALSE     9

Any help will be highly appreciated!

zhiwei li

3 Answers

Try arranging the data by status, then by value, and then taking the first row of each group:

df %>% 
  group_by(group) %>% 
  arrange(!status, desc(value)) %>% 
  slice(1)

Because we sort on `!status` first, any row with a TRUE status comes first within its group; if a group has no TRUE rows, you simply get its largest value. Combining pipes and `if` statements is generally a bit awkward. If that's something you want to look into, it is covered in an existing question, but note that a bare `if` statement doesn't work per group after `group_by()`.
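
For reference, one way to run a real `if` once per group is `group_modify()`, which calls a function on each group's rows. The following is only a sketch for illustration, not part of the approach above: `pick_max` is a hypothetical helper name, and the data frame is rebuilt from the question.

```r
library(dplyr)

# Data from the question
df <- data.frame(
  group  = rep(letters[1:3], each = 3),
  status = c(TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE),
  value  = 1:9
)

# Hypothetical helper: .x holds the rows of one group (without the grouping
# column), .y holds the group key. The if() is evaluated once per group.
pick_max <- function(.x, .y) {
  if (any(.x$status)) {
    .x <- filter(.x, status)   # keep only TRUE rows when the group has any
  }
  slice_max(.x, value, n = 1)  # row with the largest value
}

result <- df %>%
  group_by(group) %>%
  group_modify(pick_max) %>%
  ungroup()
```

This is more verbose than sorting, but it keeps the condition explicit, which may help if the two branches need to do very different things.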

MrFlick
  • Thanks. One small question: do `!` and `desc()` have the same effect here? – zhiwei li Apr 22 '21 at 03:46
  • In this case, kind of. `!` flips TRUE and FALSE, so it only really works on logical values, whereas `desc()` is more general and works with more classes. – MrFlick Apr 22 '21 at 03:48
  • Hi Flick, I've got a new problem. I thought maybe you could help me. I would appreciate it if you could take a look at my new question. https://stackoverflow.com/questions/67378023/how-to-replace-na-seperately-with-linear-model-in-r – zhiwei li May 04 '21 at 02:32
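
As a small check of the `!` versus `desc()` point from the comments, the sketch below sorts the question's data both ways and compares the resulting row order. The variable names (`by_not`, `by_desc`, `same_order`, `by_group_desc`) are made up for this illustration.

```r
library(dplyr)

# Data from the question
df <- data.frame(
  group  = rep(letters[1:3], each = 3),
  status = c(TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE),
  value  = 1:9
)

# On a logical column, both put TRUE rows first
by_not  <- df %>% arrange(!status, desc(value))
by_desc <- df %>% arrange(desc(status), desc(value))

# TRUE when both sorts produce the same row order
same_order <- identical(by_not$value, by_desc$value)

# desc() also works on non-logical columns, here a character column
by_group_desc <- df %>% arrange(desc(group), desc(value))
```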

A bit more verbose:

library(dplyr)

df %>%
  group_by(group) %>%
  filter(if (any(status)) value == max(value[status]) else value == max(value)) %>%
  ungroup()

#  group status value
#  <chr> <lgl>  <int>
#1 a     TRUE       3
#2 b     TRUE       5
#3 c     FALSE      9
Ronak Shah
  • Hi Ronak Shah, I've got a new problem. I thought maybe you could help me. I would appreciate it if you could take a look at my new question. https://stackoverflow.com/questions/67378023/how-to-replace-na-seperately-with-linear-model-in-r – zhiwei li May 04 '21 at 02:33
df %>% 
  group_by(group) %>%
  slice(which.max(value * (all(!status) | status)))
# A tibble: 3 x 3
# Groups:   group [3]
  group status value
  <chr> <lgl>  <int>
1 a     TRUE       3
2 b     TRUE       5
3 c     FALSE      9

That said, the best approach is to arrange the data.
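
To make the masking trick in `which.max(value * (all(!status) | status))` easier to follow, here is a small base R sketch that applies the same expression to single groups. `pick_index` is a hypothetical helper name used only for this illustration. The last call shows an assumption the trick relies on: values should be positive, because masked-out rows become 0.

```r
# Hypothetical helper: same expression as in the slice() call, for one group
pick_index <- function(value, status) {
  # all(!status) is TRUE only when the group has no TRUE status;
  # in that case every row is kept, otherwise only TRUE rows are kept
  mask <- all(!status) | status
  which.max(value * mask)  # rows outside the mask are multiplied by 0
}

# Group b: mask is TRUE TRUE FALSE, so the product is 4 5 0
idx_b <- pick_index(c(4, 5, 6), c(TRUE, TRUE, FALSE))

# Group c: no TRUE status, so the mask is all TRUE and the product is 7 8 9
idx_c <- pick_index(c(7, 8, 9), c(FALSE, FALSE, FALSE))

# With negative values the product is -4 -5 0, so the FALSE row wins
idx_neg <- pick_index(c(-4, -5, -6), c(TRUE, TRUE, FALSE))
```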

Onyambu