My dataframe looks like this -

dataset = data.frame(ID=c(1:3),Count=c(22,NaN,13))

I'm trying to use the pipe operator to replace NaN with 0

dataset = dataset %>% replace('NaN',0)

However, this doesn't work. I've looked at the solutions on this website, but none seem to work.

Any input would be highly appreciated.

Varun
  • `dataset %>% mutate_at(vars(Count), ~replace(., is.nan(.), 0))` – Ronak Shah Jun 28 '19 at 11:35
  • `dataset[is.na(dataset)] <- 0` - no need for dplyr. – jay.sf Jun 28 '19 at 11:35
  • @DanM I added a few more targets where `NaN` values are converted to 0. Should be fine now? – Ronak Shah Jun 28 '19 at 11:47
  • I don't think this question is a duplicate of the first and second links, because NAs are different from NaNs in R. It might be a duplicate of the third and fourth links, but this question asks about a specific package, `dplyr`. However, NaNs are most often produced by impossible calculations such as 0/0, or the sqrt or log of a negative number. So probably the best way to remove NaNs is to correct the code before they are produced. – LuizZ Apr 12 '21 at 21:40
  • @jay.sf For this simple example, that works, but in general this will not work for a solution requiring the pipe operator like the OP wants. – Earlien Oct 25 '21 at 21:13
  • `mutate_at` has been superseded. An updated version of @RonakShah's answer is: `dataset %>% mutate(across(Count, ~ replace(., is.nan(.), 0)))`. – Earlien Dec 04 '22 at 22:46
  • `dataset |> mutate(across(everything(), ~replace_na(., 0)))` – kraggle Mar 06 '23 at 17:09
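
For anyone wondering why the original attempt has no visible effect, here is a minimal sketch using the question's own `dataset` (the names `attempt` and `fixed` are just illustrative). Base R's `replace(x, list, values)` performs `x[list] <- values`, so `'NaN'` is treated as an index (here a column name) rather than as a value to look for. The piped fix follows the `across()` form from the comments above and assumes dplyr 1.0 or later is installed.

```r
library(dplyr)

dataset <- data.frame(ID = c(1:3), Count = c(22, NaN, 13))

# The pipe expands to replace(dataset, 'NaN', 0), i.e. dataset['NaN'] <- 0.
# That adds a new column called "NaN" and leaves Count untouched.
attempt <- dataset %>% replace('NaN', 0)
names(attempt)
any(is.nan(attempt$Count))

# Piped fix: build the index from the data with is.nan() instead.
fixed <- dataset %>%
  mutate(across(Count, ~ replace(.x, is.nan(.x), 0)))
fixed$Count
```

The key difference is that the second argument of `replace()` has to be an index (positions, names, or a logical vector), which is what `is.nan(.x)` supplies.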

1 Answer

This'll do it, and doesn't even require dplyr as it's in base:

dataset$Count[is.nan(dataset$Count)]<-0
DanM
  • I tried to do this, but I am getting an error saying `Error in is.nan(df) : default method not implemented for type list`. Do you know what the issue is or how to solve it? – user20203146 Jun 13 '23 at 11:31
  • We'd have to look at your data to be sure, but I note you're naming a whole dataframe, not a variable in it. (Note that the example references `dataset$Count`, not just `dataset`.) `is.nan` specifically refers to numeric variables (hence "nan" rather than "na") and therefore can't be applied to non-numeric variables. From a quick Google I'm guessing either that `df` is a variable or list rather than a dataframe, or else there is a non-numeric variable in the dataframe `df`. – DanM Jun 15 '23 at 19:01
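
The error quoted above can be reproduced with the question's data. As far as I can tell, the cause is that `is.nan()` has no method for data frames (a data frame is a list of columns), whereas `is.na()` does have one. The sketch below is one possible workaround, not necessarily what the commenter's data needs; `msg` and `nan_mask` are illustrative names.

```r
df <- data.frame(ID = c(1:3), Count = c(22, NaN, 13))

# Calling is.nan() on the whole data frame raises the error;
# tryCatch() captures its message instead of stopping the script.
msg <- tryCatch(is.nan(df), error = function(e) conditionMessage(e))
msg

# Apply is.nan() column by column to get a logical matrix,
# then use that matrix to index the data frame.
nan_mask <- sapply(df, is.nan)
df[nan_mask] <- 0
df$Count
```

If treating `NA` and `NaN` alike is acceptable, `df[is.na(df)] <- 0` from the comments on the question avoids the problem, since `is.na()` accepts a data frame directly and returns `TRUE` for `NaN` as well.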