I'd like to keep only the rows whose x1, x2, and x3 values lie between the 5th and 95th quantiles, computed within each group (id). However, I haven't managed to combine across() with my variables (x1, x2, and x3). In my example:

library(dplyr)

data <- tibble::tibble(id = paste0("sample_", rep(1:10, 10)), x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100))

data %>%
  group_by(id) %>%
  dplyr::filter(across(x1:x3, function(x) x > quantile(x, 0.05) 
                x < quantile(x, 0.95)))
#Error: Problem with `filter()` input `..1`.
#i Input `..1` is `across(...)`.
#i The error occurred in group 1: id = "sample_1".
user438383
Leprechault

2 Answers

Your code will run if you join the two conditions with & (logical "AND"):

data %>%
  group_by(id) %>%
  dplyr::filter(across(x1:x3, function(x) x > quantile(x, 0.05) & x < quantile(x, 0.95)))
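As a small illustration of why & is the operator needed here: it compares two logical vectors element by element, which is what filter() expects for each column. The vector x and the thresholds below are made up for the example.

```r
# Hypothetical values, chosen only to show element-wise behaviour
x <- c(1, 5, 10)

above <- x > 2   # FALSE  TRUE  TRUE
below <- x < 8   #  TRUE  TRUE FALSE

# & combines the two conditions position by position
both <- above & below
both             # FALSE  TRUE FALSE
```

Note that && is meant for single TRUE/FALSE values (for example inside if ()), so it is not a substitute for & in this filter.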

You can also shorten the code with:

data %>%
  group_by(id) %>%
  filter(across(x1:x3, ~ .x > quantile(.x, 0.05) & .x < quantile(.x, 0.95)))

However, I think filter() is intended to be used with either if_all() or if_any() (both introduced in dplyr 1.0.4), depending on whether you want all of the selected columns or at least one of them to satisfy the condition.

For example:

data %>%
  group_by(id) %>%
  filter(if_all(x1:x3, ~ .x > quantile(.x, 0.05) & .x < quantile(.x, 0.95)))

data %>%
  group_by(id) %>%
  filter(if_any(x1:x3, ~ .x > quantile(.x, 0.05) & .x < quantile(.x, 0.95)))

In your case, if_all and across give the same results, but I'm not sure if across is guaranteed to always behave the same as if_all.
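To make the difference between if_all() and if_any() concrete, here is a minimal sketch on a made-up data set. The tibble toy, its row column, and the helper inside() are hypothetical names introduced only for this illustration. As far as I know, newer dplyr releases also warn that across() inside filter() is deprecated in favour of these two functions, so treat that as another reason to prefer them.

```r
library(dplyr)

# Hypothetical toy data: a single group with one obvious outlier per column
toy <- tibble::tibble(
  id  = "a",
  row = 1:5,
  x1  = c(1, 2, 3, 4, 100),
  x2  = c(1, 3, 2, 100, 4)
)

# Hypothetical helper: TRUE where x lies strictly between its 5th and 95th quantiles
inside <- function(x) x > quantile(x, 0.05) & x < quantile(x, 0.95)

# Keep rows where every selected column passes the test
kept_all <- toy %>%
  group_by(id) %>%
  filter(if_all(x1:x2, inside)) %>%
  pull(row)

# Keep rows where at least one selected column passes the test
kept_any <- toy %>%
  group_by(id) %>%
  filter(if_any(x1:x2, inside)) %>%
  pull(row)

kept_all  # 2 3
kept_any  # 2 3 4 5
```

Row 1 fails in both columns, so it is dropped either way; rows 4 and 5 fail in only one column, so if_any() keeps them while if_all() drops them.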

eipi10

You forgot & between the two conditions:

library(dplyr)

data <- tibble::tibble(id = paste0("sample_", rep(1:10, 10)), x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100))

data %>%
  group_by(id) %>%
  dplyr::filter(across(.cols = x1:x3, function(x) x > quantile(x, 0.05) & 
                       x < quantile(x, 0.95)))

   id            x1      x2      x3
   <chr>      <dbl>   <dbl>   <dbl>
 1 sample_2 -0.0222 -1.17   -0.634 
 2 sample_4 -0.584   0.400  -1.01  
 3 sample_8 -0.462  -0.890   0.851 
 4 sample_1  1.39   -0.0418 -1.31  
 5 sample_2 -0.446   1.61   -0.0368
 6 sample_3  0.617  -0.148  -0.358 
 7 sample_4 -1.20    0.340   0.0903
 8 sample_6 -0.538  -1.10   -0.387 
 9 sample_9 -0.680   0.195  -1.51  
10 sample_5 -0.779   0.419   0.720 
Waldi