
(This is a beginner question, but I didn't find an answer elsewhere. Relevant posts include this one, this one, and this one, but I'm not sure how to apply them to my case.)

When I use read.dta to import Stata-format data into R, I get a warning:

> lca <- read.dta("trial.dta")

Warning message:
In `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels) else 
paste0(labels,  :
  duplicated levels in factors are deprecated

Does it simply mean that the variables ("factors" in R) contain duplicate values? If so, why is this even a warning -- isn't this expected of most variables?

Simon C
  • No, it means something like `factor(1:3, levels = c(1,1,3))` happened. – rawr Jan 07 '17 at 03:32
  • Make sure no values got turned into `NA`, but you can clean up duplicates with `droplevels`. Or maybe try a different Stata reading function like `haven::read_dta`. – alistaire Jan 07 '17 at 03:55
  • Thank you both! I tried the haven::read_dta function and the warning went away. – Simon C Jan 07 '17 at 19:15
  • The issue was caused by a country name variable that has a lot of zeros. (If I drop the variable, the warning goes away). Is it because the zeros were deemed as duplicate levels and "deprecated" into the same level? – Simon C Jan 07 '17 at 19:17

1 Answer


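Try this. It reads the raw codes without converting them to factors, then builds each factor by hand, merging any codes that share the same value label: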

library(foreign)

# Read the raw codes; the factors are built by hand below.
don <- read.dta("trial.dta", convert.dates = TRUE, convert.factors = FALSE)
val_labels  <- attr(don, "val.labels")   # label-set name for each column ("" if none)
label_table <- attr(don, "label.table")  # per set: names are labels, values are codes

for (i in seq_len(ncol(don))) {
    set_name <- val_labels[i]
    if (is.na(set_name) || set_name == "") next
    codes <- label_table[[set_name]]
    if (is.null(codes)) next             # label set referenced but not defined
    labels <- names(codes)
    dup <- duplicated(labels)
    if (any(dup)) {
        # Recode every code to the first code that carries the same label
        first_code <- unname(codes[match(labels, labels)])
        idx <- match(don[[i]], codes)
        don[[i]][!is.na(idx)] <- first_code[idx[!is.na(idx)]]
        # Keep one code per label
        codes  <- codes[!dup]
        labels <- labels[!dup]
    }
    don[[i]] <- factor(don[[i]], levels = codes, labels = labels)
}
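
If you would rather see which label sets are responsible before recoding anything, a small check along these lines may help. This is only a sketch: `find_duplicated_labels` is a hypothetical helper name, and `example_table` is made-up data that mimics the shape of the `label.table` attribute attached by read.dta (a list of named vectors, where the names are the value labels and the values are the codes). It is not taken from trial.dta.

```r
# Hypothetical helper: list the duplicated value labels in each label set.
# `label_table` is assumed to be a list of named vectors
# (names = value labels, values = codes).
find_duplicated_labels <- function(label_table) {
  dups <- lapply(label_table, function(codes) {
    labs <- names(codes)
    unique(labs[duplicated(labs)])
  })
  # Keep only the label sets that actually contain duplicates
  dups[lengths(dups) > 0]
}

# Made-up label table for illustration only
example_table <- list(
  country = c("Unknown" = 0, "France" = 1, "Unknown" = 2),
  sex     = c("Male" = 1, "Female" = 2)
)

dup_report <- find_duplicated_labels(example_table)
print(dup_report)
```

On real data you would presumably pass `attr(don, "label.table")` instead of `example_table`. Any label set that shows up in the result has two or more codes sharing one label, which is the situation the warning complains about.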