I'm attempting to add a column to a data frame that consists of normalized values by a factor.
For example:
'data.frame': 261 obs. of 3 variables:
$ Area : Factor w/ 29 levels "Antrim","Ards",..: 1 1 1 1 1 1 1 1 1 2 ...
$ Year : Factor w/ 9 levels "2002","2003",..: 1 2 3 4 5 6 7 8 9 1 ...
$ Arrests: int 18 54 47 70 62 85 96 123 99 38 ...
I'd like to add a column that are the Arrests values normalized in groups by Area.
The best I've come up with is:
data$Arrests.norm <- unlist(unname(by(data$Arrests,data$Area,function(x){ scale(x)[,1] } )))
This command processes but the data is scrambled, ie, the normalized values don't match to the correct Areas in the data frame.
Appreciate your tips.
EDIT:Just to clarify what I mean by scrambled data, subsetting the data frame after my code I get output like the following, where the normalized values clearly belong to another factor group.
Area Year Arrests Arrests.norm
199 Larne 2002 92 -0.992843957
200 Larne 2003 124 -0.404975825
201 Larne 2004 89 -1.169204397
202 Larne 2005 94 -0.581336264
203 Larne 2006 98 -0.228615385
204 Larne 2007 8 0.006531868
205 Larne 2008 31 0.418039561
206 Larne 2009 25 0.947120880
207 Larne 2010 22 2.005283518