3

I've got a dataframe with numeric values between 0 and 1. I'd like to use if_any to do numeric value filtering:

df <- data %>%
  filter(if_any(everything(), . > 0.5))

This spits out an error instead:

Error: Problem with `filter()` input `..1`.
ℹ Input `..1` is `if_any(everything(), . < 0.5)`.
x Problem with `across()` input `.fns`.
ℹ `.fns` must be NULL, a function, a formula, or a list of functions/formulas.

Does anyone have a working way of doing this filtering in a tidy way?

Thanks!

Christopher Penn

3 Answers

3

You're really close. The error message you got is because you forgot the tilde ~:

df <- data %>%
        filter(if_any(everything(), ~ . > 0.5))

I would also suggest restricting the column selection so that the condition is only applied to numeric columns (otherwise you will get an error if you have character or factor variables in your data frame):

df <- data %>%
        filter(if_any(where(is.numeric), ~ . > 0.5))
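
For reference, here is a minimal sketch on made-up data. The data frame and its column names (id, x, y) are assumptions for illustration, not taken from the question. It shows that the formula and an ordinary anonymous function give the same rows:

```r
library(dplyr)

# Hypothetical toy data: one character column and two numeric columns
data <- data.frame(
  id = c("a", "b", "c"),
  x  = c(0.1, 0.7, 0.2),
  y  = c(0.3, 0.4, 0.9)
)

# Formula shorthand: `.` stands for the current column
res_formula <- data %>%
  filter(if_any(where(is.numeric), ~ . > 0.5))

# The same predicate written as a regular anonymous function
res_function <- data %>%
  filter(if_any(where(is.numeric), function(x) x > 0.5))

# Rows "b" and "c" are kept, because each has at least one value above 0.5
res_formula$id
```

Row "a" is dropped because neither x nor y exceeds 0.5; the id column is never tested thanks to where(is.numeric).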
LMc
  • 1
    Thank you, that is super helpful. How would one pronounce that tilde if reading the line of code out loud? I think it would (might?) help me to remember that! – George D Girton Nov 22 '22 at 19:20
  • 1
    @GeorgeDGirton I am not sure about reading it out loud. Generally, you should be aware that you are trying to apply a function across your columns, where the column is the function input. The `~` is a compact way of creating an anonymous function in the `tidyverse`. It is shorthand for something like `function(x) x > 0.5`. Here is [more information](https://stackoverflow.com/questions/67643144/where-is-the-purrr-operator-documented) on the `~`. – LMc Nov 22 '22 at 20:11
  • Do you need 'if_any' with 'where'? I find that when I remove 'if_any' in a similar line of code I get results I expect. – mkrasmus Mar 23 '23 at 23:37
0

You can use filter_if:

library(dplyr)
data(iris)
iris %>% filter_if(is.numeric, ~ .x < .5)

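This keeps only the rows where every numeric column of your dataset satisfies the condition you define, here < .5. Note that filter_if is superseded in current dplyr in favor of filter() combined with if_any() or if_all().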
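
As a rough sketch of the difference, assuming a dplyr version that provides if_any() and if_all() (1.0.4 or later): with a bare formula, filter_if() requires every selected column to pass, so it behaves like if_all(); wrapping the predicate in any_vars() gives the "at least one column" behaviour asked about in the question. The result names below are illustrative only.

```r
library(dplyr)

# Bare formula: a row is kept only if ALL numeric columns are < 0.5
all_scoped <- iris %>% filter_if(is.numeric, ~ .x < 0.5)
all_tidy   <- iris %>% filter(if_all(where(is.numeric), ~ .x < 0.5))

# any_vars(): a row is kept if AT LEAST ONE numeric column is < 0.5
any_scoped <- iris %>% filter_if(is.numeric, any_vars(. < 0.5))
any_tidy   <- iris %>% filter(if_any(where(is.numeric), ~ .x < 0.5))

# Each scoped call should keep the same number of rows as its tidy counterpart
nrow(all_scoped) == nrow(all_tidy)
nrow(any_scoped) == nrow(any_tidy)
```

On iris the "all" version returns no rows, since every Sepal.Length value is well above 0.5.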

0

We may also use rowSums inside filter (this assumes every column is numeric):

library(dplyr)
data %>%
    filter(rowSums(. > 0.5) > 0)
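
One point to watch is missing values. The sketch below uses a made-up data frame (the columns x and y are assumptions for illustration) to show that rowSums() returns NA for a row containing NA unless na.rm = TRUE is set, so that row is dropped by filter(), whereas if_any() keeps it when another column passes:

```r
library(dplyr)

# Hypothetical all-numeric data with one missing value
data <- data.frame(
  x = c(0.1, 0.7, NA),
  y = c(0.3, 0.4, 0.9)
)

# Row 3 gives rowSums() == NA, so filter() drops it
plain <- data %>% filter(rowSums(. > 0.5) > 0)

# With na.rm = TRUE the NA is ignored and row 3 is kept (0.9 > 0.5)
na_removed <- data %>% filter(rowSums(. > 0.5, na.rm = TRUE) > 0)

# if_any() also keeps row 3, because NA | TRUE is TRUE
tidy <- data %>% filter(if_any(everything(), ~ . > 0.5))

c(plain = nrow(plain), na_removed = nrow(na_removed), tidy = nrow(tidy))
```

If the data has no missing values, the three versions should select the same rows.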
akrun