I am currently trying to run the anesrake function in R to compute raking weights across several variables. It is part of the anesrake package (https://cran.r-project.org/web/packages/anesrake/index.html), which weights a sample so that its attribute distributions match known population distributions.
I have a table of sample data testData:
Index GENDER AGE
1 Female 18-24
2 Female 35-64
3 Male 65+
Note: AGE has 6 levels: 18-24, 25-34, 35-44, 45-54, 55-64, 65+
I then have two vectors of population proportions:
GENDER <- c(.49,.51)
AGE <- c(.08,.1,.12,.2,.2,.3)
I then create a set of target variables and a CASEID column on the original table:
targets <- list(GENDER, AGE)
names(targets) <- c("GENDER", "AGE")
testData$CASEID <- 1:length(testData$GENDER)
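For reproducibility, here is a minimal, self-contained sketch of the setup above in base R. The three rows are copied verbatim from the table. The level names attached to the proportions are an assumption, since the vectors above do not say which proportion belongs to which level.

```r
# Sample rows exactly as shown in the table above (values kept verbatim).
testData <- data.frame(
  GENDER = c("Female", "Female", "Male"),
  AGE    = c("18-24", "35-64", "65+"),
  stringsAsFactors = FALSE
)

# Population proportions. The level names attached here are assumptions:
# the original vectors are unnamed.
GENDER <- c(Female = .49, Male = .51)
AGE <- c("18-24" = .08, "25-34" = .1, "35-44" = .12,
         "45-54" = .2, "55-64" = .2, "65+" = .3)

targets <- list(GENDER = GENDER, AGE = AGE)
testData$CASEID <- seq_len(nrow(testData))

# Sanity check: each set of target proportions should sum to 1.
sapply(targets, sum)
```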
I can then see how far my sample data deviate from my population targets:
> anesrakefinder(targets, testData, choosemethod = "total")
GENDER AGE
0.1495337 0.3668394
But when I use the anesrake function to do the final analysis, I get an error and a warning:
> anesrake(inputter=targets,dataframe=testData,caseid=testData$CASEID)
Error in rakeonvar.default(mat[, i], inputter[[i]], weightvec) :
number of variable levels does not match number of weighting levels
In addition: Warning message:
In rakeonvar.default(mat[, i], inputter[[i]], weightvec) :
NAs introduced by coercion
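Since the error mentions a mismatch between variable levels and weighting levels, one check that may help narrow it down is to count the distinct values of each variable in the data against the number of target proportions. The sketch below uses only base R. `compare_levels` is a hypothetical helper, not an anesrake function, and it runs on the three-row excerpt shown above, not the full data set.

```r
# Hypothetical helper (not part of anesrake): for each target variable, compare
# the number of distinct non-missing values in the data with the number of
# target proportions supplied.
compare_levels <- function(targets, data) {
  data.frame(
    variable  = names(targets),
    in_data   = sapply(names(targets),
                       function(v) length(unique(na.omit(data[[v]])))),
    in_target = sapply(targets, length),
    row.names = NULL
  )
}

# Three-row excerpt from the table above; the full data set may differ.
excerpt <- data.frame(
  GENDER = c("Female", "Female", "Male"),
  AGE    = c("18-24", "35-64", "65+"),
  stringsAsFactors = FALSE
)
targets <- list(GENDER = c(.49, .51),
                AGE    = c(.08, .1, .12, .2, .2, .3))

res <- compare_levels(targets, excerpt)
print(res)
```

A variable whose `in_data` and `in_target` counts differ would be a natural place to start looking, though this alone does not confirm the cause of the error.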
I've been following two tutorials on how to use anesrake, but I'm still coming up short. The tutorials are:
http://sdaza.com/survey/2012/08/25/raking/
http://surveyinsights.org/wp-content/uploads/2014/07/Full-anesrake-paper.pdf
Any help you could provide would be greatly appreciated.
Cheers,
Stu