I have a very odd problem. I'm importing some factor variables from Stata into R using the readstata13 package. The imported labels/levels look OK, but the values change when I remove the factor class. Here is the Stata description of the variable (here is the data for reproducibility):
Notice that some labels are missing (UPDATE: actually, they are not missing. Rather, they are filled with a space, an odd way the coder used to flag a missing label). Notice also that variable value 13 has 7 observations.
So I import the data into R and check the levels and frequencies. All fine:
Then I remove the levels using as.integer() (or as.numeric()), but things get messed up, in particular for values 11, 12 and 13. Notice that the 7 observations now appear under value 11, rather than under 13:
The problem remains regardless of the read.dta13 options related to factors. I tried the second suggestion in this answer, using the following code, but it did not work (most likely because only two values have labels):
labname <- get.label.name(data,"J_Itm1")
labtab <- get.label(data, labname)
table(get.origin.codes(data$J_Itm1, labtab))
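For reference, the same kind of shift can be reproduced in base R without the Stata file, so it may be related to how as.integer() treats factors: it returns the position of each level, not the original code. A minimal sketch with made-up values (these are not the real J_Itm1 data, and the level labels here are plain numbers, which is an assumption that does not hold for my variable):

```r
# Made-up codes: values 1..10 once each, plus seven observations of 13.
# Codes 11 and 12 never occur, so the factor has only 11 levels.
codes <- c(1:10, rep(13, 7))
f <- factor(codes)

# as.integer() returns level positions (1..11), not the original codes
positions <- as.integer(f)
table(positions)   # the seven 13s are counted under 11

# Going through the level labels recovers the codes,
# but only because these labels happen to be numeric strings
recovered <- as.integer(as.character(f))
table(recovered)   # the seven observations are back under 13
```

The as.character() route obviously cannot work when the labels are text or a single space, which is why I am looking for a way to get the original Stata codes back.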
Any idea how to solve the problem?