df <- data.frame(x=c(1,1,2,2,3,3,3),y=c(1,3,4,3,5,2,3))

I'd like to create a column with the scaled values of y within each group of x, so when x==1 --> scale(c(1,3)), x==2 --> scale(c(4,3)), etc.

This is what I'm trying to achieve

x  y  y2
1  1  -0.7071
1  3   0.7071
2  4   0.7071
2  3  -0.7071
3  5   1.0910
3  2  -0.8728
3  3  -0.2182
HappyPy

1 Answer


You can apply the scale function by group. In base R:

df$y2 <- with(df, ave(y, x, FUN = scale))
df

#  x y        y2
#1 1 1 -0.707107
#2 1 3  0.707107
#3 2 4  0.707107
#4 2 3 -0.707107
#5 3 5  1.091089
#6 3 2 -0.872872
#7 3 3 -0.218218
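
As a sanity check, here is a minimal sketch of what scale computes within each group, assuming its defaults (center = TRUE, scale = TRUE): subtract the group mean, then divide by the group's sample standard deviation. The helper zscore is a hypothetical name introduced only for this illustration and is not part of the original answer.

```r
df <- data.frame(x = c(1, 1, 2, 2, 3, 3, 3), y = c(1, 3, 4, 3, 5, 2, 3))

# Hypothetical helper: standardise a vector by hand.
zscore <- function(v) (v - mean(v)) / sd(v)

# Apply both versions group-wise with ave(), grouping y by x.
manual  <- ave(df$y, df$x, FUN = zscore)
builtin <- ave(df$y, df$x, FUN = scale)

# Compare the hand-rolled result with scale().
all.equal(manual, builtin)
```

A group with a single row has no sample standard deviation, so its scaled value will not be a finite number with either version.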

With dplyr:

library(dplyr)
# scale() returns a one-column matrix; as.numeric() keeps y2 a plain numeric column
df %>% group_by(x) %>% mutate(y2 = as.numeric(scale(y)))

And with data.table:

library(data.table)
setDT(df)[, y2 := scale(y), x]

Data used:

df <- data.frame(x=c(1,1,2,2,3,3,3),y=c(1,3,4,3,5,2,3))
Ronak Shah
  • dplyr method is incorrect. – nikn8 Apr 21 '20 at 06:27
  • @Neel I get the same output with all 3 methods. If you are using OP's data, note that OP has 6 rows in their data whereas the expected output shows 7 rows. I've updated the post with the data I used. – Ronak Shah Apr 21 '20 at 06:30
  • Damn, true, and that's why the results were not matching when I used OP's data. – nikn8 Apr 21 '20 at 06:33