How to replace only NA data with 0 in R and not the NaN value in a dataframe?

Question

My question is how to keep NaN values from being changed when I only want to replace NA values.

I used this code, but it changes both:

dataframe[is.na(dataframe)] <- 0

I saw this link and understand that the reason this happens is:

 is.na(NaN)
 [1] TRUE

But I really want to distinguish between the two, so I tried to replace the NaN values first, intending to run the previous code afterwards:

dataframe[is.nan(dataframe)] <-"NAN"

I get this error:

default method not implemented for type 'list'

So how should I do this?

Thank you in advance.

Thank you @RichardScriven, yes, you are right: that could also be done by replacing with 0 wherever this statement is false. Sorry for not thinking it through first! – f.a, Nov 18 '14 at 07:12

score 2 · Answer 1 · answered Nov 18 '14 at 07:16

I present this as an alternative approach to @akrun's answer.

You're getting the error because is.na has a method for handling data.frames (base:::is.na.data.frame) but is.nan does not. (Note that a data.frame is really just a list whose named elements, the columns, all have the same length.)
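A minimal sketch of that difference, assuming a small hypothetical data frame `df` (not the data from the question): `is.na` returns a logical matrix, `is.nan` on the whole frame fails, and applying `is.nan` column by column works.

```r
# Hypothetical example data, for illustration only
df <- data.frame(a = c(1, NA, NaN), b = c(NaN, 2, NA))

# is.na() has a data.frame method, so this returns a logical matrix
na_mask <- is.na(df)

# is.nan() has no data.frame method; capture the error message instead of stopping
nan_err <- tryCatch(is.nan(df), error = function(e) conditionMessage(e))

# Applied column by column, is.nan() only ever sees plain numeric vectors
nan_mask <- sapply(df, is.nan)
```

Here `na_mask` flags both NA and NaN cells, while `nan_mask` flags only the NaN cells.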

One problem you'll run into is that as you assign a character string to the NaN values in your (otherwise numeric?) data.frame, you'll be converting the remainder of each column to character. What you should be focusing on is finding the logic to include NA and exclude NaN.
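The coercion can be seen on a single vector. This is only a sketch with a made-up vector `x` standing in for one numeric column:

```r
x <- c(1, NaN, NA, 4)   # hypothetical numeric column
class(x)                # "numeric"

y <- x
y[is.nan(y)] <- "NAN"   # assigning a string coerces the whole vector
class(y)                # "character"

# Targeting only the true NA entries with a numeric value keeps the type
z <- x
z[is.na(z) & !is.nan(z)] <- 0
class(z)                # "numeric"
```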

There are several ways you can handle it. If all of the columns are to be processed, then the following might work for you, using @akrun's data:

as.data.frame(lapply(dat, function(x) {
    x[is.na(x) & !is.nan(x)] <- 0
    x
}))
##     V1  V2  V3 V4  V5
## 1    0   6   0  9   5
## 2  NaN   9   2 10   6
## 3    4 NaN NaN  5   1
## 4   10   4 NaN  9 NaN
## 5    8   6   1  1   6
## 6    7 NaN   7 10 NaN
## 7    9 NaN   5  1   0
## 8    2   2   0  3   8
## 9    8   6   6  0 NaN
## 10   9   7   0  8   8
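If only some columns should be converted, the same test can be applied in place with a `for` loop over the chosen column names. The data frame `df` and the selection `cols` below are hypothetical, chosen only to illustrate the idea:

```r
# Hypothetical data and column selection, for illustration only
df <- data.frame(a = c(1, NA, NaN), b = c(NA, 2, NaN), c = c(NA, NaN, 3))
cols <- c("a", "b")   # columns to fix; "c" is left untouched

for (nm in cols) {
  x <- df[[nm]]
  x[is.na(x) & !is.nan(x)] <- 0   # true NA only, NaN is kept
  df[[nm]] <- x
}
```

After the loop, columns `a` and `b` have 0 in place of NA and still contain their NaN values, and column `c` is unchanged.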

r2evans


Not sure what's the reasoning for using a loop instead of a vectorized approach – David Arenburg Nov 18 '14 at 07:23
This isn't a loop, but if you only want to run this conversion on select columns, you could use a `for` loop to iterate over specific columns, fixing them in-place. – r2evans Nov 18 '14 at 07:25
`lapply` *is* a `for` loop – David Arenburg Nov 18 '14 at 07:31
I understand your semantic assertion, so forgive my previous generalization. To me, all of the `*apply` functions operate on all elements of a vector/list/matrix/data.frame, while literal `for` and `while` are the "for" loops. One rationale against @akrun's vectorized approach is that `is.nan` just won't work on a data.frame. His approach works here, but if more control is necessary, his vectorized approach may be too indiscriminate. That's one reason I prefaced with "alternative approach to @akrun's". (One might argue `lapply` is more like a `foreach` loop, though. ;-) – r2evans Nov 18 '14 at 07:39

akrun · Accepted Answer · 2014-11-18T07:10:31.677

Try

dat[is.na(dat=='NaN')] <- 0
dat
 #   V1  V2  V3 V4  V5
 #1    0   6   0  9   5
 #2  NaN   9   2 10   6
 #3    4 NaN NaN  5   1
 #4   10   4 NaN  9 NaN
 #5    8   6   1  1   6
 #6    7 NaN   7 10 NaN
 #7    9 NaN   5  1   0
 #8    2   2   0  3   8
 #9    8   6   6  0 NaN
 #10   9   7   0  8   8
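Why the comparison with 'NaN' works may not be obvious, so here is a sketch on a single hypothetical vector. Comparing numbers with a string coerces the numbers to character: NaN becomes the string "NaN" and the comparison is TRUE, whereas NA stays missing and the comparison is NA. `is.na` on the result therefore flags only the true NA entries.

```r
x <- c(1, NA, NaN)       # hypothetical numeric vector

cmp <- x == "NaN"        # numbers are coerced to character for the comparison
# cmp is FALSE, NA, TRUE: NaN became the string "NaN", NA stayed missing

only_na <- is.na(cmp)    # TRUE only where x was a true NA
x[only_na] <- 0          # x is now 1, 0, NaN
```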

Or

 indx <- is.na(dat)
 dat[indx][!is.nan(dat[indx])] <- 0
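The same idea can also be written as one logical mask. This is a sketch that assumes every column is numeric and uses hypothetical data; `sapply` is one way to get around `is.nan` not accepting a data.frame:

```r
# Hypothetical example data, for illustration only
df <- data.frame(a = c(1, NA, NaN), b = c(NA, 2, NaN))

# TRUE where a value is NA but not NaN; sapply() runs is.nan() per column
mask <- is.na(df) & !sapply(df, is.nan)

df[mask] <- 0
```

Indexing a data.frame with a logical matrix of the same dimensions replaces only the flagged cells, so the NaN entries are left as they are.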

data

set.seed(42)
dat <- as.data.frame(matrix(sample(c(1:10, NA, NaN), 
                            5*10, replace=TRUE), ncol=5))
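A note on reproducing this data: R 3.6.0 changed the default sampling algorithm used by `sample()`, so `set.seed(42)` gives different values on current versions than it did in 2014. Assuming R 3.6.0 or later, switching to the older "Rounding" sampler should bring back the earlier behaviour. Whether it reproduces the printed tables exactly has not been checked here.

```r
# Assumption: R >= 3.6.0, where RNGkind() accepts the sample.kind argument.
# "Rounding" is the pre-3.6.0 sampler; R warns that it is non-uniform.
suppressWarnings(RNGkind(sample.kind = "Rounding"))
set.seed(42)
dat <- as.data.frame(matrix(sample(c(1:10, NA, NaN),
                            5*10, replace=TRUE), ncol=5))

# Restore the current default sampler afterwards
RNGkind(sample.kind = "Rejection")
```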
