Questions about by() and weighted.mean() already exist, but none of them helped me solve my problem. I am new to R and am more used to data mining languages than to programming.
I have a data frame that contains, for each individual (observation/row), the income, education level and sample weight. I want to calculate the weighted mean of income by education level and store the result for each individual in a new column of my original data frame, like this:
obs  income  education  weight  incomegroup
1    1000    A          10      --> display weighted mean of income for education level A
2    2000    B          1       --> display weighted mean of income for education level B
3    1500    B          5       --> display weighted mean of income for education level B
4    2000    A          2       --> display weighted mean of income for education level A
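To make the expected output concrete, here is the example as a reproducible data frame, with the values I would expect worked out by hand. This is only a sketch: the name `data` matches my code further down, and the `expected_*` names are made up for this illustration.

```r
# Example data from the table above.
data <- data.frame(
  income    = c(1000, 2000, 1500, 2000),
  education = c("A", "B", "B", "A"),
  weight    = c(10, 1, 5, 2)
)

# Weighted mean = sum(income * weight) / sum(weight), written out by hand per group.
expected_A <- (1000 * 10 + 2000 * 2) / (10 + 2)   # about 1166.67
expected_B <- (2000 * 1 + 1500 * 5) / (1 + 5)     # about 1583.33

# The column I would like to end up with, one value per row in the original order.
expected_incomegroup <- c(expected_A, expected_B, expected_B, expected_A)
```

So rows 1 and 4 should both get about 1166.67 and rows 2 and 3 about 1583.33.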
I tried:
data$incomegroup=by(data$education, function(x) weighted.mean(data$income, data$weight))
It does not work. A weighted mean is calculated and appears in the "incomegroup" column, but it seems to be computed over the whole data set, or for one group only, rather than by group; I cannot tell which. I have read about the plyr package and the aggregate function, but neither seems to do what I am interested in.
The ave {stats} function gives exactly what I am looking for, but only for a simple mean:
data$incomegroup=ave(data$income,data$education,FUN = mean)
I could not find a way to use it with weights.
Thank you in advance for your help!