I created a means column for a group based on a criterium C. Now I want those means to be filled out over the entire column, even when criterium C does not hold. So basically I want to replace NA
's with the mean value calculated for that group. You can see the grp, val and C colum in the next Data.table
grp val C
1: 1 NA 0
2: 1 NA 0
3: 1 42 1
4: 1 42 1
5: 2 16 1
6: 2 16 1
7: 2 NA 0
8: 2 NA 0
9: 3 32 1
10: 3 32 1
11: 3 32 1
12: 3 32 1
So I want to replace the val
NA
's with the mean value in the same group.
Here is sample code of how I attempt to do it.
Basicly I extract another data.table
, remove the NA
's and duplicates and then try to merge it with the original table.
x <- data.table(grp=c(1,1,1,1,2,2,2,2,3,3,3,3),val=c(NA,NA,42,42,16,16,NA,NA,32,32,32,32),C=c(0,0,1,1,1,1,0,0,1,1,1,1))
y <- x[!is.na(val),]
y <- y[!duplicated(y),]
setkey(x,grp)
setkey(y,grp)
x[y,val:=val,by=grp]
while this does not give any errors it leaves the original column val
untouched. What am I doing wrong? what would be a better approach?