I have a problem raking data that contains missing values. The missing values occur only in the `plec` column. How can I work around this?

This is the code I have, followed by the error it produces:


library(readxl)
library(survey)  # provides svydesign() and rake()

DATASET <- read_excel("C:/Users/Mateusz/Desktop/25.10/Nowy Arkusz programu Microsoft Excel.xlsx")
DATA <- as.data.frame(DATASET)

# Unweighted design: every row starts with the same weight
data.svy.unweighted <- svydesign(ids = ~1, data = DATA)

# Target marginal distributions, scaled to the sample size
plec.dist   <- data.frame(plec = c("k", "m"), Freq = nrow(DATA) * c(.49, .51))
miasto.dist <- data.frame(miasto = c(1, 2, 3), Freq = nrow(DATA) * c(.64, .12, .24))
wiek.dist   <- data.frame(wiek = c(1, 2), Freq = nrow(DATA) * c(.7, .3))

data.svy.rake <- rake(design = data.svy.unweighted,
                      sample.margins = list(~plec, ~miasto, ~wiek),
                      population.margins = list(plec.dist, miasto.dist, wiek.dist))
Error in na.fail.default(list(plec = c("m", "k", "k", "m", "m", "k", "k",  : 
  missing values in object

  • Hi, since we can't access your dataset, your post isn't reproducible. Please check this on how to write a reproducible example: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – jrcalabrese Oct 25 '22 at 15:28
  • Give the output of `dput(DATA)` – Julien Oct 25 '22 at 15:35
  • Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer. – Community Oct 25 '22 at 16:48

1 Answer

You can't have missing data for the variables in `sample.margins` (as the documentation for `rake` says). If you don't know which cell some observations are in, you can't work out the raking weights.

Perhaps you want to rake just the observations that have non-missing values. In that case, use `subset` to restrict your design to just those observations before calling `rake`.
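
As a rough sketch of that approach: the toy data frame below is hypothetical and only stands in for the asker's `DATA`, the explicit weight column `w` is an assumption added to make the equal starting weights visible, and the margins are scaled to the number of complete cases so that all three totals agree.

```r
library(survey)

# Hypothetical toy data standing in for the asker's DATA (not the real file)
set.seed(1)
n <- 200
DATA <- data.frame(
  plec   = sample(c("k", "m", NA), n, replace = TRUE, prob = c(.45, .45, .10)),
  miasto = sample(1:3, n, replace = TRUE, prob = c(.5, .2, .3)),
  wiek   = sample(1:2, n, replace = TRUE, prob = c(.6, .4)),
  w      = 1  # assumption: equal starting weights, made explicit
)

design <- svydesign(ids = ~1, weights = ~w, data = DATA)

# Keep only the observations where plec is known
design_complete <- subset(design, !is.na(plec))
n_complete <- sum(!is.na(DATA$plec))

# Target margins, scaled to the number of complete cases
plec.dist   <- data.frame(plec = c("k", "m"), Freq = n_complete * c(.49, .51))
miasto.dist <- data.frame(miasto = c(1, 2, 3), Freq = n_complete * c(.64, .12, .24))
wiek.dist   <- data.frame(wiek = c(1, 2), Freq = n_complete * c(.7, .3))

raked <- rake(design = design_complete,
              sample.margins = list(~plec, ~miasto, ~wiek),
              population.margins = list(plec.dist, miasto.dist, wiek.dist))

# Inspect the weighted margin after raking
svytable(~plec, raked)
```

Note that the rows with a missing `plec` are simply left out of the raked design, so they receive no raking weight. Whether that is acceptable depends on why the values are missing.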

Thomas Lumley