7

I am trying to get all the columns of my data frame onto the same scale.

Right now I have something like this, where a is on a 0-1 scale, b is on a 0-100 scale, and c is on a 1-5 scale:

a   b     c 
0   89   4 
1   93   3 
0   88   5

How would I get everything onto a 100 scale, like this?

a     b      c 
0     89     80 
100   93     60 
0     88     100 

I hope that is reasonably clear. I have tried scale() but cannot seem to get it to work.

Matt.G

3 Answers

18

Using scale, if dat is the name of your data frame:

## for one column
dat$a <- scale(dat$a, center = FALSE, scale = max(dat$a, na.rm = TRUE)/100)
## for every column of your data frame
dat <- data.frame(lapply(dat, function(x) scale(x, center = FALSE, scale = max(x, na.rm = TRUE)/100)))

For a simple case like this, you could also write your own function.

fn <- function(x) x * 100/max(x, na.rm = TRUE)
fn(c(0,1,0))
# [1]   0 100   0
## to one column
dat$a <- fn(dat$a)
## to all columns of your data frame
dat <- data.frame(lapply(dat, fn))
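
Note that dividing by the observed maximum rescales b as well (89, 93, 88 becomes roughly 95.7, 100, 94.6), whereas the question's desired output leaves b unchanged. If the theoretical maximum of each scale is known, a minimal sketch along these lines reproduces the table from the question. The name scale_max and its values (1, 100, 5) are assumptions taken from the scales described in the question, not part of the original answer.

```r
# Example data from the question
dat <- data.frame(a = c(0, 1, 0), b = c(89, 93, 88), c = c(4, 3, 5))

# Assumed theoretical maximum of each column's scale (hypothetical helper vector)
scale_max <- c(a = 1, b = 100, c = 5)

# Multiply each column by 100 / its scale maximum, matching columns by name
dat100 <- as.data.frame(
  Map(function(x, m) x * 100 / m, dat, scale_max[names(dat)])
)

dat100
# a becomes 0, 100, 0; b stays 89, 93, 88; c becomes 80, 60, 100
```

This only makes sense when every scale starts at 0 (or when, as in the question, the lower end should not be mapped to 0).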
Blue Magister
5

I don't think this is fully answered: if one of the columns contained a negative value such as -2, the current answer would not produce a 0-100 scale. I appreciate the answer, but when I tried it on variables that range from -100 to 100, some values were still negative.

I have a solution in case this applies to you:

rescale <- function(x) (x - min(x)) / (max(x) - min(x)) * 100
## apply column by column so each column uses its own min and max
dat[] <- lapply(dat, rescale)
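
As a quick check of the min-max idea, here is a small sketch on made-up data with a column that runs from -100 to 100. The function name rescale_0_100, the column d, and the use of na.rm = TRUE are assumptions added for illustration.

```r
# Min-max rescaling to 0-100; na.rm = TRUE is an added assumption so that
# a missing value does not turn the whole column into NA
rescale_0_100 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1]) * 100
}

# Hypothetical data: d runs from -100 to 100
dat <- data.frame(a = c(0, 1, 0), d = c(-100, 0, 100))

# dat[] keeps the data frame structure while replacing every column
dat[] <- lapply(dat, rescale_0_100)

dat
# a becomes 0, 100, 0; d becomes 0, 50, 100
```

A column whose values are all identical has max equal to min, so this divides by zero and returns NaN; such columns would need separate handling.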
J Walt
0

An even simpler option that adapts to other scales is the rescale() function from the scales package. If you wanted to scale from 3 to 50 for some reason, you could set the to parameter to c(3,50) instead of c(0,100). Additionally, you can set the from parameter if your data needs to fit the scale of another dataset (that is, when the min/max of your data is not the min/max of the scale you are mapping from). The second example below treats the data as lying on a -100 to 100 scale, so 0 is the midpoint, and rescaling to 0-100 places 0 at 50 (the halfway point).

library(scales)

# 0 to 100 scaling
rescale(1:10, to = c(0,100))
# [1]   0.00000  11.11111  22.22222  33.33333  44.44444  55.55556  66.66667  77.77778  88.88889
# [10] 100.00000

# use 'from' to indicate the extended range of values
rescale(seq(0,100,10), to = c(0,100), from = c(-100,100))
# [1]  50  55  60  65  70  75  80  85  90  95 100
Eric