Normalizing numeric variable according to a factor in R

Question

I'm trying to normalize a variable (using its minimum and maximum values) according to a second variable (a factor).

It'll be clearer using the diamonds dataframe as an example.

This normalizes the carat variable to the 0-1 interval:

library(ggplot2)  # provides the diamonds dataset
di <- diamonds
di$caratn <- (di$carat - min(di$carat)) / (max(di$carat) - min(di$carat))

But I would like to do the normalization according to the clarity variable (which is a factor). That is, taking all carat values of a given clarity and normalizing them to the 0-1 interval.

The result would be that the highest carat of clarity SI2 would have a value of 1, and the same thing for the other clarities.

score 1 · Accepted Answer · answered Dec 04 '15 at 16:55

Here's a solution using ave():

di <- within(di, caratn <- ave(carat, clarity, FUN = function(x) (x - min(x)) / diff(range(x))))
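
To see what ave() is doing here, a small self-contained sketch may help. It uses a made-up toy data frame in place of diamonds, so it runs without ggplot2. The values and the helper name rescale01 are assumptions for illustration, not part of the original answer. The helper also returns 0 for a group whose values are all equal, where the one-liner above would give NaN from 0/0.

```r
# Toy data standing in for diamonds (hypothetical values, for illustration only)
toy <- data.frame(
  carat   = c(0.2, 0.5, 1.0, 0.3, 0.9, 1.5),
  clarity = factor(c("SI2", "SI2", "SI2", "VS1", "VS1", "VS1"))
)

# Min-max scaling helper (hypothetical name, not from the original answer).
# For a constant group it returns zeros instead of NaN from 0/0.
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  if (rng[1] == rng[2]) return(x - rng[1])
  (x - rng[1]) / (rng[2] - rng[1])
}

# ave() applies FUN separately within each level of clarity and returns
# the results in the original row order
toy$caratn <- ave(toy$carat, toy$clarity, FUN = rescale01)

# Inspect the per-group range of the normalized values
tapply(toy$caratn, toy$clarity, range)
```

Because ave() returns a vector the same length as its input, the result can be assigned straight back as a new column, which is why no merge or reordering step is needed.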

Sam Dickson

Perfect and simple! Thanks! – xgrau Dec 04 '15 at 17:17
