
I have a data frame and I want to remove the rows that have Inf values in a selected column. I'm looking for a tidyverse solution, because I want to use it inside my tidyverse pipes.

Example:

df <- data.frame(a = c(1, 2, 3, NA), b = c(5, Inf, 8, 8), c = c(9, 10, Inf, 11), d = c('a', 'b', 'c', 'd'))

I want to remove rows having Inf values in column c. The result would be:

df2 <- data.frame(a = c(1, 2, NA), b = c(5, Inf, 8), c = c(9, 10, 11), d = c('a', 'b', 'd'))

I wish there was a function something like drop_inf(), similar to drop_na().

EDIT: The column name is passed as a variable.

Cettt
Ilona

1 Answer

You can use is.finite() inside filter():

library(dplyr)

df %>%
  filter(is.finite(c))

   a   b  c d
1  1   5  9 a
2  2 Inf 10 b
3 NA   8 11 d
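
One thing to keep in mind: is.finite() is stricter than "not Inf". It is also FALSE for NA and NaN, so rows with a missing value in the filtered column are dropped as well. A small base R check, in case those rows should be kept:

```r
x <- c(1, Inf, -Inf, NA, NaN)

# TRUE only for ordinary numbers; Inf, -Inf, NA and NaN are all FALSE
is.finite(x)

# FALSE only for Inf and -Inf; NA and NaN pass through as TRUE
!is.infinite(x)
```

So filter(!is.infinite(c)) removes only the Inf rows, while filter(is.finite(c)) removes NA and NaN rows too. For the example data both give the same result, because column c has no missing values.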

If the column should be chosen by the caller, wrap the filter in a function and use the embrace operator {{ }}:

my_fun <- function(df, filter_col) {
  df %>%
    filter(is.finite({{ filter_col }}))
}

my_fun(df, b)
my_fun(df, c)

This way you can work dynamically with my_fun like with other dplyr verbs. For example:

df %>% select(a:c) %>% my_fun(c)

See also this question.
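
For the case in the question's edit, where the column name arrives as a character string stored in a variable, {{ }} does not help: the function ends up evaluating is.finite("CPC") on the string itself, which is FALSE, so every row is dropped. One common approach for string column names is the .data pronoun with [[ ]]. A minimal sketch, where drop_inf() is a hypothetical helper name borrowed from the question and not an existing tidyverse function:

```r
library(dplyr)

# drop_inf() is a hypothetical helper; `col` is a column name given as a string.
# Note: is.finite() is also FALSE for NA/NaN; use !is.infinite() to keep those rows.
drop_inf <- function(df, col) {
  df %>%
    filter(is.finite(.data[[col]]))
}

df <- data.frame(a = c(1, 2, 3, NA), b = c(5, Inf, 8, 8),
                 c = c(9, 10, Inf, 11), d = c('a', 'b', 'c', 'd'))

# The column name lives in a variable, as in the question's edit
filter_col <- "c"
df %>% drop_inf(filter_col)
```

With the example data this should return the same three rows as filter(is.finite(c)).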

Cettt
  • Sorry, I forgot to mention that the column name is passed as a variable. – Ilona Feb 03 '21 at 11:47
  • Sounds great and reasonable. But for some reason it doesn't work for me :( ```variable <- 'CPC'``` In the data there is a column CPC. ```model_per_day <- function (account, campaign_type, variable, forecast_period) { data_per_day <- data %>% filter(!is.infinite({{variable}})) )``` It doesn't throw any error but also doesn't remove the Inf's. – Ilona Feb 03 '21 at 13:12
  • does the code work for the example you have provided? You definitely have to provide the data you are using (i.e. `data`) as the first input to your function. – Cettt Feb 03 '21 at 13:15
  • library('dplyr') df <- data.frame(a = c(1, 2, 3, NA), b = c(5, Inf, 8, 8), CPC = c(9, 10, Inf, 11), d = c('a', 'b', 'c', 'd')) filter_col <- 'CPC' filter_infs <- function(df, filter_col){ df %>% filter(is.finite({{filter_col}})) } # This works - but it has the column name passed, which I don't want df %>% filter_infs(CPC) # And this doesn't work - while it has the column name passed as a variable. It filters out all the rows. df %>% filter_infs(filter_col) – Ilona Feb 03 '21 at 13:26
  • Sorry that it is all pasted in one line.. I don't know how to make Enters here... – Ilona Feb 03 '21 at 13:26
  • It does not work because you are writing `filter_infs(df, 'CPC')`. You should write `filter_infs(df, CPC)` instead (do not use `''`). – Cettt Feb 03 '21 at 13:51
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/228212/discussion-between-ilona-and-cettt). – Ilona Feb 03 '21 at 14:14