Scaling an R data frame to 0-1 with NA values

Question

I have seen these solutions:

However, those methods do not work when NA values are present.

I have tried this:

s = append(sort(rexp(100)),rep(NA,30))
o = data.frame(s,s)

range01 <- function(x) {
    if (!is.na(x)) {
        return(NA)
    } else {
        y = (x - min(x)) / (max(x) - min(x))
        return(y)
    }
}

xo = apply(o,MARGIN = 2, FUN = range01)

But it doesn't work... Suggestions?

The solution should work on data frames via the apply function.

Shouldn't `if (!is.na(x))` be `if (is.na(x))`? Otherwise you are returning `NA` for all non-`NA` input, and trying to do calculations on `NA` input. — nrussell, Aug 10 '15 at 17:55
@nrussell You are correct. `!is.na()` returns values that aren't `NA` because of the not operator `!`. — N8TRO, Aug 10 '15 at 17:57
I cannot use na.rm = TRUE, because I have to keep the data frame as it is... but thanks! — Kaervas, Aug 10 '15 at 18:01
@Kaervas Don't edit your question with the answer. Either accept the answer below. Or if that does not work, add your own answer below and accept it. The community should vote on what it thinks is the best answer. — MrFlick, Aug 10 '15 at 18:18

Gregor Thomas · Accepted Answer · 2015-08-10T18:06:55.217

7

Here's the answer from the 2nd question you linked:

function(x) {(x - min(x)) / (max(x) - min(x))}

We can modify this to work with NAs by using the built-in NA handling in min and max (their na.rm argument):

stdize = function(x, ...) {(x - min(x, ...)) / (max(x, ...) - min(x, ...))}

Then you can call it and pass through na.rm = TRUE.

x = rexp(100)
x[sample(1:100, size = 10)] <- NA
stdize(x)  # all NA, because min(x) and max(x) are NA
stdize(x, na.rm = TRUE) # works!
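
To make the effect of the pass-through easier to check by hand, here is a minimal sketch on a tiny vector. The vector `v` and its values are made up for illustration and are not from the question.

```r
# stdize as defined above: extra arguments in ... are forwarded to min() and max()
stdize <- function(x, ...) {
  (x - min(x, ...)) / (max(x, ...) - min(x, ...))
}

# Hypothetical values, chosen so the result is easy to verify by hand
v <- c(2, 4, NA, 10)

min(v)                   # NA: a single missing value makes min() return NA
min(v, na.rm = TRUE)     # 2
stdize(v)                # NA NA NA NA
stdize(v, na.rm = TRUE)  # 0, 0.25, NA, 1
```

With na.rm = TRUE the minimum is 2 and the maximum is 10, so 4 maps to (4 - 2) / (10 - 2) = 0.25, and the missing entry stays NA because NA minus anything is still NA.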

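Or, using the o data frame from your question:

o_std = lapply(o, stdize, na.rm = TRUE)

The NAs will still be there at the end, in their original positions: na.rm = TRUE only changes how min and max are computed, so every NA in the input still produces NA in the output.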
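
Since lapply returns a list rather than a data frame, one possible way to keep the data frame structure is to assign the result back into the columns with `[] <-`. This is a sketch, not the only option; the `set.seed` call is an assumption added here only so the example is reproducible.

```r
# stdize as defined in the answer
stdize <- function(x, ...) {
  (x - min(x, ...)) / (max(x, ...) - min(x, ...))
}

set.seed(1)  # assumption: fixed seed, only for reproducibility
s <- append(sort(rexp(100)), rep(NA, 30))
o <- data.frame(s, s)  # columns are named s and s.1

# Assigning into o_std[] replaces the columns but keeps the data frame class
o_std <- o
o_std[] <- lapply(o, stdize, na.rm = TRUE)

class(o_std)                           # still "data.frame"
identical(is.na(o_std$s), is.na(o$s))  # NA positions are unchanged
range(o_std$s, na.rm = TRUE)           # non-missing values span 0 to 1
```

Working on a copy (`o_std`) leaves the original `o` untouched; wrapping the lapply result in `as.data.frame()` would be another way to get a data frame back.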

edited Aug 10 '15 at 18:06

answered Aug 10 '15 at 18:00


It works for a vector, but not for a data frame. Try: `stdize = function(x, ...) {(x - min(x, ...)) / (max(x, ...) - min(x, ...))}; x = rexp(100); x[sample(1:100, size = 10)] <- NA; x = data.frame(x,x); xo = apply(x,MARGIN = 2, FUN = stdize)` – Kaervas Aug 10 '15 at 18:06
Added example, you still have to specify `na.rm = T`. You can pass it through `apply` or `lapply`. – Gregor Thomas Aug 10 '15 at 18:07
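
As a sketch of the apply variant mentioned in the comment above: arguments placed after FUN in apply are forwarded to the function for each column, so na.rm = TRUE can be passed there. The names `df`, `m`, and the seed are assumptions introduced for this illustration.

```r
# stdize as defined in the answer
stdize <- function(x, ...) {
  (x - min(x, ...)) / (max(x, ...) - min(x, ...))
}

set.seed(42)  # assumption: fixed seed, only for reproducibility
x <- rexp(100)
x[sample(1:100, size = 10)] <- NA
df <- data.frame(x, x)

# Arguments after FUN (here na.rm = TRUE) are passed on to stdize for each column
m <- apply(df, MARGIN = 2, FUN = stdize, na.rm = TRUE)

is.matrix(m)            # TRUE: apply() returns a matrix, not a data frame
xo <- as.data.frame(m)  # convert back if a data frame is needed
colSums(is.na(xo))      # 10 NAs per column, same as the input
```

Without na.rm = TRUE in the apply call, every column would come back entirely NA, which is likely what the comment's example ran into.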
