How to replace NaN value with zero in a huge data frame?

Question

I tried to replace NaN values with zeros using the following script:

rapply( data123, f=function(x) ifelse(is.nan(x),0,x), how="replace" )
# [31]   0.00000000  -0.67994832   0.50287454   0.63979527   1.48410571  -2.90402836

The output showed zero in place of the NaN, but when I typed the name of the data frame to inspect it again, the value was still NaN.

data123$contri_us
# [31]          NaN  -0.67994832   0.50287454   0.63979527   1.48410571  -2.90402836

I am not sure whether the rapply command actually modified the data frame or only displayed the replaced values.

Any idea how to actually change the NaN value to zero?
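
A note on the behaviour described above: rapply() returns a modified copy and leaves its argument untouched, so the result has to be assigned to take effect. A minimal sketch, assuming a small made-up stand-in for data123 (only the column name comes from the question):

```r
# Hypothetical stand-in for data123; the values are invented for illustration.
data123 <- data.frame(contri_us = c(NaN, -0.5, 1.5))

# rapply() does not modify data123 in place; it returns a new object.
fixed <- rapply(data123, f = function(x) ifelse(is.nan(x), 0, x), how = "replace")

data123[["contri_us"]]  # the original still contains NaN
fixed[["contri_us"]]    # the returned copy has 0 in its place
```

Assigning the result back under the original name (data123 <- rapply(...)) makes the change persist.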

I tried but R gave the following error message: > Error in is.nan(data123) : default method not implemented for type 'list' – cactussss, Aug 09 '13 at 08:39
Hong Ooi's solution works with matrices, but, sadly, not data frames. – ben, Feb 02 '23 at 23:58

Hong Ooi · Answer 1 · 2014-03-13T18:58:50.150

127

It would seem that is.nan doesn't actually have a method for data frames, unlike is.na. So, let's fix that!

is.nan.data.frame <- function(x)
  do.call(cbind, lapply(x, is.nan))

data123[is.nan(data123)] <- 0
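
To see the method in action, here is a minimal sketch on a made-up data frame (df and its values are assumptions for illustration, not the asker's data). It also shows that NA values are left alone, since is.nan(NA) is FALSE:

```r
# S3 method so that is.nan() dispatches on data frames:
# apply is.nan() to each column and bind the results into a logical matrix.
is.nan.data.frame <- function(x)
  do.call(cbind, lapply(x, is.nan))

# Hypothetical example data with both NaN and NA.
df <- data.frame(a = c(1, NaN, 3), b = c(NaN, NA, 6))

# The logical matrix selects exactly the NaN cells.
df[is.nan(df)] <- 0
df
```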

edited Mar 13 '14 at 18:58

answered Aug 09 '13 at 08:46



Your bottom function should be "is.nan.data.frame". – Concerned_Citizen Jan 29 '14 at 05:27

@Dombey That isn't required; by the magic of method dispatch, `is.nan.data.frame` will be called automatically. – Hong Ooi Mar 13 '14 at 18:59
Can someone explain why the first two lines are written? Logically, only the third line is required, right? – user20203146 Jun 13 '23 at 11:58

score 43 · Answer 2 · edited Feb 02 '16 at 20:43

In fact, in R, this operation is very easy:

If the matrix 'a' contains NaN values, the following code replaces them with 0:

a <- matrix(c(1, NaN, 2, NaN), ncol=2, nrow=2)
a[is.nan(a)] <- 0
a

If the data frame 'b' contains NaN values, the following code replaces them with 0:

# for a data.frame:
b <- data.frame(c1=c(1, NaN, 2), c2=c(NaN, 2, 7))
b[is.na(b)] <- 0
b

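Note the difference: is.nan for the matrix vs. is.na for the data frame. Keep in mind that is.na is TRUE for both NA and NaN, so the data frame version also replaces any NA values with 0.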

Doing

#...
b[is.nan(b)] <- 0
#...

yields: Error in is.nan(b) : default method not implemented for type 'list' because b is a data frame.
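
The two predicates are not interchangeable, which matters when a data frame holds genuine NA values. A small sketch (the extra NA in c1 is added here for illustration and is not part of the example above):

```r
# is.na() is TRUE for both NA and NaN; is.nan() is TRUE only for NaN.
na_sees_nan <- is.na(NaN)   # TRUE
nan_sees_na <- is.nan(NA)   # FALSE

# Variant of 'b' with one real NA added for illustration.
b <- data.frame(c1 = c(1, NaN, NA), c2 = c(NaN, 2, 7))
b[is.na(b)] <- 0
b   # both the NaN and the NA in c1 are now 0
```

So the is.na approach is fine when NA and NaN should be treated alike, but it cannot replace NaN while keeping NA.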

Note: Edited for small but confusing typos

edited Feb 02 '16 at 20:43

Mekki MacAulay

answered Jul 02 '15 at 13:55

leDjeg


This explanation is _wrong_. A NA is not the data frame equivalent of a NaN. – Hong Ooi Nov 08 '18 at 12:43
Wrong answer. Agreed. – ABCD Jan 10 '19 at 06:59
This answer is applicable when you are dealing only with numbers and NaN, or if you want to treat NA as NaN, because is.na(NaN) == TRUE. – Roman Zenka Apr 01 '21 at 13:43

score 29 · Answer 3 · answered Aug 09 '13 at 07:45

The following should do what you want:

x <- data.frame(X1=sample(c(1:3,NaN), 200, replace=TRUE), X2=sample(c(4:6,NaN), 200, replace=TRUE))
head(x)
x <- replace(x, is.na(x), 0)
head(x)

Marc in the box

atsyplenkov · Answer 4 · 2023-02-03T16:27:46.980

Here is a tidyverse solution. I've generated sample data with both NaN and NA. The first column is fully complete.

library(dplyr)  # provides tibble(), %>% and mutate()

df <- tibble(x = LETTERS[1:5],
             y = c(1:3, NaN, 4),
             z = c(rep(NaN, 3), NA, 5))

df

# A tibble: 5 x 3
  x         y     z
  <chr> <dbl> <dbl>
1 A         1   NaN
2 B         2   NaN
3 C         3   NaN
4 D       NaN    NA
5 E         4     5

Then we can apply mutate_all with replace to the dataframe:

df %>% 
   mutate_all(~replace(., is.nan(.), 0))

# A tibble: 5 x 3
  x         y     z
  <chr> <dbl> <dbl>
1 A         1     0
2 B         2     0
3 C         3     0
4 D         0    NA 
5 E         4     5

We've replaced NaN values with zero and touched neither NA values nor the x column.

UPDATE for dplyr 1.0.0

Since mutate_all() is superseded as of dplyr 1.0.0, we can rewrite the expression using across() as follows:

df %>% 
  mutate(across(everything(), ~replace(.x, is.nan(.x), 0)))

# A tibble: 5 × 3
  x         y     z
  <chr> <dbl> <dbl>
1 A         1     0
2 B         2     0
3 C         3     0
4 D         0    NA
5 E         4     5
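
For comparison, the same column-wise replacement can be sketched in base R without dplyr. This is an alternative I am adding for illustration, not part of the tidyverse solution above; it uses a plain data.frame with the same sample values and an is.numeric() guard so non-numeric columns are passed through untouched:

```r
# Same sample values as above, as a base data.frame.
df <- data.frame(x = LETTERS[1:5],
                 y = c(1:3, NaN, 4),
                 z = c(rep(NaN, 3), NA, 5),
                 stringsAsFactors = FALSE)

# Replace NaN with 0 in numeric columns only; df[] keeps the data.frame structure.
df[] <- lapply(df, function(col) {
  if (is.numeric(col)) replace(col, is.nan(col), 0) else col
})
df
```

As with the dplyr version, NA values and the x column are left as they were.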
