Standardizing (z-score) multiple variables at once

Question

I have a dataset that looks something like this, but with hundreds of variables:

set.seed(123)
df <- data.frame(id= c(1,1,1,2,2,2,3,3,3), time=c(1,2,3,1,2,3,1,2,3),y = rnorm(9), x1 = rnorm(9), x2 = c(0,0,0,0,1,0,1,1,1), x3 = rnorm(9), c1 = rnorm(9),  c2 = rnorm(9))

I would like to standardize all my variables to ease interpretation after regression. I know I could standardize the variables one by one using BBmisc:

library(BBmisc)
df$z_y <- normalize(df$y, method = "standardize")

But this would be tedious, and it would make the script long and disorganized.

Since I am not comfortable writing loops or functions, I was wondering whether someone knows how to do this in one or a few lines, ideally while selecting which variables to standardize.

Also, it would be good if the solution could detect dummies (such as x2) and avoid standardizing those.

I thank you in advance for your help

It's already vectorized, try `normalize(df[3:8], method="standardize")`. — jay.sf, Oct 21 '19 at 16:58
Also, if it's just a z-score, then you don't really need a specific package. You can do `data.frame(lapply(df[3:8], function(x) (x - mean(x))/sd(x)))`. — tmfmnk, Oct 21 '19 at 16:59
Why not use the base R function `scale`? I.e., `scale(df[-(1:2)])` — Onyambu, Oct 21 '19 at 16:59
Is there a way that I can use so that R automatically detects dummies or character strings and avoid standardizing those? — Alex, Oct 21 '19 at 17:04
Yeah, supposing only the variables to be normalized are numeric (do `df[1:2] <- lapply(df[1:2], as.factor)` for testing), you could do `scale(df[lapply(df, class) == "numeric"])`. — jay.sf, Oct 21 '19 at 17:30
Possible duplicate of [Standardize data columns in R](https://stackoverflow.com/questions/15215457/standardize-data-columns-in-r) — M--, Oct 21 '19 at 17:43
or with `dplyr` you can do `mutate_if(df, is.numeric, scale)` — , Oct 21 '19 at 17:57
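
Pulling the comments above together, here is a minimal base R sketch that standardizes only the continuous columns and skips 0/1 dummies. It is one possible approach, not a definitive answer. The helper `is_continuous`, the vector `keep_raw`, and the rule "a dummy is a numeric column containing only 0, 1, or NA" are assumptions made for illustration; the `z_` prefix follows the `z_y` naming used in the question.

```r
# Rebuild the example data from the question.
set.seed(123)
df <- data.frame(id = c(1,1,1,2,2,2,3,3,3), time = c(1,2,3,1,2,3,1,2,3),
                 y = rnorm(9), x1 = rnorm(9), x2 = c(0,0,0,0,1,0,1,1,1),
                 x3 = rnorm(9), c1 = rnorm(9), c2 = rnorm(9))

# Hypothetical helper (an assumption, not part of any package):
# TRUE for numeric columns that contain something other than 0, 1, or NA.
is_continuous <- function(x) is.numeric(x) && !all(x %in% c(0, 1, NA))

# Columns to leave untouched regardless of type (assumed identifiers).
keep_raw <- c("id", "time")

# Names of the columns that will be standardized.
to_scale <- setdiff(names(df)[sapply(df, is_continuous)], keep_raw)

# scale() returns a one-column matrix per variable; as.numeric() drops the
# matrix attributes so each new z_ column is a plain numeric vector.
df[paste0("z_", to_scale)] <- lapply(df[to_scale],
                                     function(x) as.numeric(scale(x)))
```

With the example data, `to_scale` contains `y`, `x1`, `x3`, `c1`, and `c2`, so `x2` is left as it is. Character and factor columns are skipped as well, because `is.numeric` returns FALSE for them. To overwrite the original columns instead of adding `z_` copies, assign to `df[to_scale]`.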
