I'm working with American Community Survey microdata using the survey package, and am hoping to calculate some basic income inequality statistics. I've set up the following as my design:
testsurv <- svrepdesign(data = test, repweights = test[, 8:87], weights = test$HHWT,
                        combined.weights = TRUE, type = "Fay", rho = 0.5, scale = 4/80,
                        rscales = rep(1, 80), mse = TRUE)
From that, I'd like to calculate Gini coefficients by year, as well as quantile ratios of income, also by year. Generating the quantiles and their standard errors is straightforward using svyby and svyquantile:
quants <- svyby(~INCOME, ~YEAR, testsurv, svyquantile,
                quantiles = c(0.9, 0.75, 0.5, 0.25, 0.1), keep.var = TRUE)
That brings me to my first question: How do I calculate the standard errors for ratios of income quantiles (e.g. 90/10) if I have the replicate-weight-based errors for each quantile? I tried using svyratio, but that's for ratios of entire variables, not for selected observations within variables.
Second question: Is there a way to calculate the Gini coefficient (with replicate-based errors) within survey using existing functions like gini from reldist? I tried using withReplicates, but it didn't work well, maybe because gini orders its arguments as variable, then weights, while the instructions for withReplicates specify the opposite order. I tried both ways, but neither worked. For example, I tried this, where HHWT holds the sample weights:
> withReplicates(testsurv, gini(~HHWT, ~INCOME))
That yields the following error message:
Error in sum(weights) : invalid 'type' (language) of argument
In addition: Warning message:
In is.na(x) : is.na() applied to non-(list or vector) of type 'language'