R - Normalizing certain columns from 0 to 1 with values equal to 0

Question

I recently started with R and I would like to scale my data matrix. I found a way to do that here: Scale a series between two points

x <- data.frame(step = c(1,2,3,4,5,6,7,8,9,10))
normalized <- (x-min(x))/(max(x)-min(x))

As my data consists of several columns, of which I only want to normalize certain ones, using a function was suggested:

normalized <- function(x) (x- min(x))/(max(x) - min(x))
x[] <- lapply(x, normalized)

Additionally, I realized that some of the data points in my dataset equal 0, so the presented formula no longer works. I added an extension suggested here: scaling r dataframe to 0-1 with NA values

normalized <- function(x, ...) {(x - min(x, ...)) / (max(x, ...) - min(x, ...))}

But I don't understand how to code it. For example, I would like to normalize columns 4, 5, 6 and 10 but keep the remaining columns as they were in the data set. I tried it for column 4:

data <- lapply(data[,4],normalized,na.rm= TRUE)

But it did not work (the result was a list instead of a data frame :-(...). Does anybody know how I could fix it?

Thanks a lot already in advance!

what exactly is your desired output? Please add this to your question. — Andre Elrico, Feb 15 '18 at 09:17
`data[, c(4, 5, 6, 10)] <- lapply(data[, c(4, 5, 6, 10)], normalized)`? — Cath, Feb 15 '18 at 09:18
Thanks a lot!! It worked and was actually super simple! But I am a total newbie :-D — Jael, Feb 15 '18 at 09:37
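
A minimal sketch of the approach from Cath's comment, combined with the `na.rm = TRUE` argument from the question. The data frame and its column names here are made up for illustration, and columns 2 and 3 stand in for columns 4, 5, 6 and 10. As far as I can tell, the attempt in the question returned a list because `data[,4]` drops to a plain vector, so `lapply` calls the function on each single value instead of on the whole column.

```r
# Hypothetical example data; the column names are invented for illustration.
data <- data.frame(
  id    = 1:5,
  a     = c(0, 5, 10, NA, 20),
  b     = c(2, 4, 6, 8, 10),
  label = c("u", "v", "w", "x", "y")
)

# Variadic min-max scaler from the question; extra arguments such as
# na.rm = TRUE are passed on to min() and max().
normalized <- function(x, ...) {
  (x - min(x, ...)) / (max(x, ...) - min(x, ...))
}

cols <- c(2, 3)  # positions of the columns to rescale

# Selecting with a vector of indices keeps a data frame, so lapply()
# visits whole columns. Assigning back into the same columns leaves
# all other columns untouched and keeps `data` a data frame.
data[cols] <- lapply(data[cols], normalized, na.rm = TRUE)

print(data)
```

Zeros in a column are scaled like any other value, and `NA` entries stay `NA` in the result.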

score 6 · Answer 1 · answered Feb 15 '18 at 09:23

Try this; I have modified the normalized function to handle NA values:

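db <- data.frame(a = c(22, 33, 28, 51, 25, 39, 54, NA, 50, 66),
                 b = c(22, 33, NA, 51, 25, 39, 54, NA, 50, 66))

normalized <- function(y) {
  # Scale only the non-missing values; NA positions stay NA.
  x <- y[!is.na(y)]
  x <- (x - min(x)) / (max(x) - min(x))
  y[!is.na(y)] <- x
  return(y)
}

# MARGIN = 2 applies the function column by column; apply() returns a matrix.
apply(db[, c(1, 2)], 2, normalized)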

Your output:

               a          b
 [1,] 0.00000000 0.00000000
 [2,] 0.25000000 0.25000000
 [3,] 0.13636364         NA
 [4,] 0.65909091 0.65909091
 [5,] 0.06818182 0.06818182
 [6,] 0.38636364 0.38636364
 [7,] 0.72727273 0.72727273
 [8,]         NA         NA
 [9,] 0.63636364 0.63636364
[10,] 1.00000000 1.00000000

answered Feb 15 '18 at 09:23 by Terru_theTerror


There was no need to modify the function: `normalized <- function(x, ...) {(x - min(x, ...)) / (max(x, ...) - min(x, ...))}` handles `NA` perfectly when `na.rm = TRUE` is passed as an argument. (Also, `apply` will convert the result to a matrix.) – Cath Feb 15 '18 at 09:33
What does the 2 mean in `apply(db[,c(1,2)],2,normalized)`? In my case I can also fill in 1, and I don't know which one I should use. – Rivered Oct 07 '22 at 09:38
From the `apply` documentation: a vector giving the subscripts which the function will be applied over. E.g., for a matrix 1 indicates rows, 2 indicates columns, c(1, 2) indicates rows and columns. – Terru_theTerror Oct 10 '22 at 14:26
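
To make the difference between `1` and `2` concrete, here is a small sketch on a made-up matrix. The name `rescale` and the example values are assumptions for illustration only. It also shows the point from the first comment: `apply` returns a matrix even when given a data frame.

```r
# Small made-up matrix: 2 rows, 3 columns.
m <- matrix(c(1,  2,  3,
              10, 20, 30), nrow = 2, byrow = TRUE)

# Plain min-max scaler (no NA handling, to keep the example short).
rescale <- function(x) (x - min(x)) / (max(x) - min(x))

by_col <- apply(m, 2, rescale)  # MARGIN = 2: each column is scaled on its own
by_row <- apply(m, 1, rescale)  # MARGIN = 1: each row is scaled on its own

print(by_col)  # same shape as m
print(by_row)  # the result for each row comes back as a column (transposed)

# apply() coerces a data frame to a matrix, so the result is a matrix too.
df <- as.data.frame(m)
print(is.matrix(apply(df, 2, rescale)))
```

For normalizing columns of a data frame, `2` is the relevant choice; using `lapply` and assigning back into the selected columns avoids the conversion to a matrix.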
