
I'm a newbie in R and I'm trying to improve my skills. At the moment, I'm stuck with a very easy problem (I hope).

Background

In my data, the variables P21, PONDERA and ESTADO all have the same length...

I'm working with a huge database, and I want to calculate an average income. In my data, P21 is the income value for an entry in the sample, and I'm trying to weight it by PONDERA so that it is representative of the entire population. ESTADO == 1 means the person is employed, which is why the weights are restricted to employed people. I then divide the weighted sum by the weighted employed population to get the average income.

Salario_OP <- Base_total %>%
  group_by(ANO4) %>%
  summarise(Ingreso = sum(P21*(PONDERA[ESTADO == 1]))/sum(PONDERA[ESTADO == 1]))

I really think it's easy to solve, but the language does not help me understand everything (I'm Argentinian). I hope you can help me. Thank you in advance!

NelsonGon
  • What is the error you are receiving? Can you provide a representative sample of `Salario_OP`? – Calum You Feb 01 '19 at 23:44
  • Welcome to Stack Overflow! To help others understand your problem and help you, it would be a good idea to share a small example of your data (or at least fake data that looks similar to your real data). For some examples of how to do this, please check out this post ['how to make a great R reproducible example'](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – bschneidr Feb 02 '19 at 01:04

1 Answer


When you subset PONDERA with [ESTADO == 1], it is no longer the same length as P21, so the two vectors no longer line up element by element. You need to subset P21 with the same condition as well. Try:

Salario_OP <- Base_total %>%
  group_by(ANO4) %>%
  summarise(Ingreso = sum(P21[ESTADO == 1]*(PONDERA[ESTADO == 1]))/sum(PONDERA[ESTADO == 1]))
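
An equivalent way to write this, as a minimal sketch: filter to ESTADO == 1 first, so every column keeps the same length within each group, and then let base R's weighted.mean() do the division. The toy data below is invented purely for illustration; only the column names come from the question.

```r
library(dplyr)

# Hypothetical toy data: column names follow the question, values are made up.
Base_total <- tibble(
  ANO4    = c(2017, 2017, 2017, 2018, 2018),
  ESTADO  = c(1, 1, 2, 1, 3),
  P21     = c(100, 300, 0, 200, 50),
  PONDERA = c(1, 3, 5, 2, 4)
)

# Keep only employed people (ESTADO == 1) before grouping, so P21 and
# PONDERA are subset together and stay aligned row by row.
Salario_OP <- Base_total %>%
  filter(ESTADO == 1) %>%
  group_by(ANO4) %>%
  # weighted.mean(x, w) computes sum(x * w) / sum(w)
  summarise(Ingreso = weighted.mean(P21, PONDERA))

print(Salario_OP)
```

Filtering up front avoids repeating [ESTADO == 1] on every vector, which is where the length mismatch came from. If P21 contains missing values in your real data, you may also need na.rm = TRUE.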
Amadou Kone