How can I scale integer values in a data frame to values from 0 to 100? For example, take this data frame:

> employee <- c('John Doe','Peter Gynn','Jolie Hope')
> value <- c(1, 3, 365)
> startdate <- as.Date(c('2010-11-1','2008-3-25','2007-3-14'))
> employ.data <- data.frame(employee, value, startdate)
> employ.data$value <- as.integer(employ.data$value)

How can I scale the value column to a range between 0 and 100? The threshold should be 50, so my output should look like this:

> employ.data
    employee value  startdate
1   John Doe     1 2010-11-01
2 Peter Gynn     3 2008-03-25
3 Jolie Hope   100 2007-03-14
Timothy_Goodman
  • I changed the name because of the low numbers :) – Timothy_Goodman Jan 15 '19 at 14:37
  • pmax changed nothing in my df? – Timothy_Goodman Jan 15 '19 at 14:37
  • `employ.data$value <- pmin(employ.data$value, 100)` – Ronak Shah Jan 15 '19 at 14:38
  • I think you want something like: `(value-min(value))/(max(value)-min(value))*100` – Dave2e Jan 15 '19 at 14:43
  • It is indeed a duplicate. My answer below refers to [an answer](https://stackoverflow.com/a/22075339/2433233) provided there. Sorry, I just saw that. – swolf Jan 15 '19 at 14:47
  • Please clarify what you want done. Typically, *scale* refers to a linear transformation that affects all values the same way. As Dave2e says, it is usually `(x - min(x)) / (max(x) - min(x)) * desired_max`, sometimes subtracting `mean(x)` instead of `min(x)` (called *centering and scaling*). In your desired result, only the 365 value is changed, so some questions: (a) What do you mean by "the threshold should be 50"? (b) Do you only want to change values over 50? (c) Do you only want to change values over 100? (d) If your input data was `x = c(1, 2, 50, 51, 100, 200)`, what result would you want? – Gregor Thomas Jan 15 '19 at 14:49

2 Answers

library(scales)
employ.data$scaled.value <- rescale(employ.data$value, from = c(1, 365), to = c(0, 100))

Do you see what happened? The rescale() function takes the "old" range as the argument "from" and the "new" range as the argument "to". It then maps the values linearly from one range to the other, so the values in employ.data$scaled.value are on the new 0 to 100 scale.
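For readers who do not have the scales package installed, the same kind of linear mapping can be sketched in base R. The helper name rescale_linear is hypothetical (it is not part of scales or base R), and the sketch assumes "from" and "to" are both length-2 numeric ranges.

```r
# Hypothetical helper, not part of any package: linearly map x
# from the range `from` onto the range `to`.
rescale_linear <- function(x, from = range(x), to = c(0, 100)) {
  (x - from[1]) / (from[2] - from[1]) * (to[2] - to[1]) + to[1]
}

value <- c(1L, 3L, 365L)
scaled <- rescale_linear(value, from = c(1, 365), to = c(0, 100))

# 1 maps to 0, 3 maps to roughly 0.55, and 365 maps to 100
print(round(scaled, 2))
```

Because the mapping is linear, every value moves, not only the large one: 3 ends up near 0.55 rather than staying at 3.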

I don't understand what you mean by "The threshold should be 50", though.

swolf

By threshold I mean that all values < 50 keep their value.

I got it now with:

employ.data$value <- replace(employ.data$value, employ.data$value > 50, 100)

And my result:

> employ.data
    employee value  startdate
1   John Doe     1 2010-11-01
2 Peter Gynn     3 2008-03-25
3 Jolie Hope   100 2007-03-14
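Note that replace() with the condition > 50 sets every value above the threshold to 100, including values between 51 and 99, whereas the pmin() suggestion from the comments only caps values above 100. A small sketch contrasting the two on the test vector from Gregor Thomas's comment (the variable names thresholded and capped are only illustrative):

```r
# Test vector suggested in the comments on the question
x <- c(1, 2, 50, 51, 100, 200)

# Approach used in this answer: everything above 50 becomes 100
thresholded <- replace(x, x > 50, 100)
print(thresholded)  # 1 2 50 100 100 100

# pmin() alternative: only values above 100 are pulled down to 100
capped <- pmin(x, 100)
print(capped)       # 1 2 50 51 100 100
```

For the three values in the question both approaches give the same result, since 365 is the only value above 50.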
Timothy_Goodman