Hi, I am trying to do best subset selection with the NHANES 2003-2004 dataset.

library(leaps)  # provides regsubsets()
load("/Users/nhanes2003-2004.Rda")
regfit.full <- regsubsets(RIDAGEEX ~ ., data = nhanes2003_2004)

I keep getting this error message:

Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels

Here is a snapshot of the data, which can also be accessed at https://wwwn.cdc.gov/nchs/nhanes/ContinuousNhanes/Default.aspx?BeginYear=2003

How do I check the number of levels of each factor and avoid this error message? Thank you!

user3349164

1 Answer

One option is to keep only the columns that have more than one unique non-missing value. A factor with a single level is what triggers the error.

# TRUE for columns with more than one unique non-missing value
validcols <- sapply(nhanes2003_2004,
    function(x) length(unique(x[!is.na(x)])) > 1)

# Select only the valid columns (drop = FALSE keeps a data frame)
df <- nhanes2003_2004[, validcols, drop = FALSE]

# Perform the analysis on df
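
To see which columns are the offenders before dropping them, here is a minimal sketch on a toy data frame. The `toy` data and its column names are made up for illustration and are not taken from NHANES; the same `sapply` call should work on `nhanes2003_2004`, assuming it is a regular data frame.

```r
# Toy data frame standing in for nhanes2003_2004 (hypothetical columns)
toy <- data.frame(
  age   = c(21, 35, 47, 52),
  sex   = factor(c("M", "F", "F", "M")),
  cycle = factor(rep("2003-2004", 4)),  # factor with a single level
  empty = c(NA, NA, NA, NA)             # column with no observed values
)

# Number of distinct non-missing values in each column
n_distinct <- sapply(toy, function(x) length(unique(x[!is.na(x)])))
print(n_distinct)

# Columns with fewer than two distinct values: single-level factors
# trigger the contrasts error, and constant or all-NA columns carry
# no information for the model anyway
bad_cols <- names(n_distinct)[n_distinct < 2]
print(bad_cols)

# Keep only the remaining columns
toy_clean <- toy[, n_distinct > 1, drop = FALSE]
```

Printing `bad_cols` first makes it easy to check whether any of the dropped variables matter for the analysis before fitting on the reduced data.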
MKR