
I have a data frame df:

x <- c(1,1,2,2,3,3,4,5)
y <- c(1,1,2,3,3,3,4,4)
freq <- c(4,6,7,2,2,6,5,1)
distmean <- c(2,4,5,7,3,5,2,7)
df <- data.frame(x,y,freq,distmean)


x      y   freq   distmean
1      1      4          2
1      1      6          4 
2      2      7          5
2      3      2          7
3      3      2          3
3      3      6          5 
4      4      5          2 
5      4      1          7

I want to aggregate the rows based on x and y, with the sum of freq and a mean of distmean weighted by freq.

So in the end I want:

x      y   freq    distmean
1      1     10         3.2
2      2      7         5.0
2      3      2         7.0
3      3      8         4.5
4      4      5         2.0 
5      4      1         7.0

I tried aggregate(), which lets me combine the duplicated rows, but I can't figure out how to compute a weighted mean with it.

thelatemail
Ariel Kaputkin

1 Answer


This is not the shortest way to do it, but it works if you want to avoid more complex functions: sum freq per group, sum freq*distmean per group, then divide the second sum by the first.

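# Total freq per (x, y) group
df2 <- aggregate(freq ~ x + y, data = df, FUN = sum, na.rm = TRUE)
# Sum of freq * distmean per group (numerator of the weighted mean)
df$dist <- df$distmean * df$freq
b <- aggregate(dist ~ x + y, data = df, FUN = sum, na.rm = TRUE)
# Both results list the groups in the same order, so divide element-wise
df2$distmean <- b$dist / df2$freq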
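
If you would rather have R compute the weighted mean directly, here is a minimal base R sketch using split() and weighted.mean(). It is an alternative to the two aggregate() calls, not a correction of them. The names groups, rows and result are only illustrative, and the data is copied from the question.

```r
# Sample data from the question
df <- data.frame(
  x = c(1, 1, 2, 2, 3, 3, 4, 5),
  y = c(1, 1, 2, 3, 3, 3, 4, 4),
  freq = c(4, 6, 7, 2, 2, 6, 5, 1),
  distmean = c(2, 4, 5, 7, 3, 5, 2, 7)
)

# Split the rows into (x, y) groups; drop = TRUE skips combinations with no rows
groups <- split(df, list(df$x, df$y), drop = TRUE)

# Collapse each group to one row: total freq and the freq-weighted mean of distmean
rows <- lapply(groups, function(g) {
  data.frame(
    x = g$x[1],
    y = g$y[1],
    freq = sum(g$freq),
    distmean = weighted.mean(g$distmean, g$freq)
  )
})

# Bind the per-group rows together, sort by x then y, and reset the row names
result <- do.call(rbind, rows)
result <- result[order(result$x, result$y), ]
rownames(result) <- NULL
print(result)
```

With this approach each group's frame stays intact inside the function, so you can add further per-group summaries without another aggregate() call.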

Yogesh