I want to get the mean of V1, weighted by V2 and split by V3 and V4, so I tried aggregate. It doesn't work, because weighted.mean only receives one argument from aggregate, so it behaves the same as mean:
df <- data.frame(V1=seq(1, 1.5, 0.1),
                 V2=seq(6, 1, -1),
                 V3=c('A','A','A','B','B','B'),
                 V4=c('F','P','P','P','F','F'))
aggregate(cbind(V1, V2) ~ V3 + V4, data=df, weighted.mean)
  V3 V4   V1  V2
1  A  F 1.00 6.0
2  B  F 1.45 1.5
3  A  P 1.15 4.5
4  B  P 1.30 3.0
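To make the problem concrete, here is a quick check on a single group (A, P). It is only a sketch of my understanding: the variable names grp, unweighted and weighted are just for illustration, and it assumes aggregate passes each column to the function separately, which matches the output above.

```r
df <- data.frame(V1 = seq(1, 1.5, 0.1),
                 V2 = seq(6, 1, -1),
                 V3 = c('A','A','A','B','B','B'),
                 V4 = c('F','P','P','P','F','F'))

# Rows belonging to the group V3 == 'A', V4 == 'P'
grp <- df[df$V3 == 'A' & df$V4 == 'P', ]

# weighted.mean() called with x only (no weights) falls back to equal
# weights, which appears to be what aggregate() ends up computing
unweighted <- weighted.mean(grp$V1)

# What I actually want: V1 weighted by V2 within the group
weighted <- weighted.mean(grp$V1, grp$V2)

# The unweighted call agrees with plain mean()
print(all.equal(unweighted, mean(grp$V1)))
print(c(unweighted = unweighted, weighted = weighted))
```

The unweighted value matches the V1 column that aggregate printed for that group, while the weighted value is the one I am after.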
So I had to fall back on split and sapply, plus a lot of other bookkeeping:
df1 <- expand.grid(unique(df$V3), unique(df$V4))
colnames(df1) <- c('V3','V4')
df1$CrossTerm <- interaction(df1$V3, df1$V4)
temp <- sapply(split(df, interaction(df$V3, df$V4)), function(df) weighted.mean(df$V1, df$V2))
df2 <- data.frame(WeightedAvg=temp)
df2$CrossTerm <- rownames(df2)
df3 <- merge(df1, df2, by="CrossTerm")
df3$CrossTerm <- NULL
df3
  V3 V4 WeightedAvg
1  A  F    1.000000
2  A  P    1.144444
3  B  F    1.433333
4  B  P    1.300000
But this is an ugly and possibly buggy solution. Is there a better way to get a result like this? The format is important and should be as close to my output as possible (the order of the rows can differ).