I want to stratify the train/test split for a regression task in mlr. The data looks like

f1,f2,f3,... m1,m2...,p1,p2,p3...

where the f_i columns are numeric and the remaining columns are factors and integers.

I defined a custom measure m1. After running the following

measures1 = list(m1, medae) 
measures2 = lapply(measures1, setAggregation, train.mean)
measures = c(measures1, measures2)
# rdesc = makeResampleDesc("CV", iters = 3, predict = "both", stratify.cols = "Iodine" ) #Default is 2/3, both=train&test
rdesc = makeResampleDesc("CV", iters = 3, predict = "both" ) #Default is 2/3

I got the following error:

[Resample] cross-validation iter: 1
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
  contrasts can be applied only to factors with 2 or more levels

When I subset the input data frame to only the numeric columns, the error goes away. In fact, only the numeric columns are useful for prediction, but I need the other columns to stratify the train-test split. Does anyone know what is wrong?

sunxd

1 Answer

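It turns out that one column had only a single unique value. I found it by counting the unique values in each column:

rapply(dat, function(x) length(unique(x)))

A factor with a single level cannot be given contrasts, which is exactly what the error message reports. Problem solved.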
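For anyone hitting the same error, here is a minimal base-R sketch of that check, extended to drop the offending columns. The toy data frame and the names `n_unique`, `constant_cols`, and `dat_clean` are made up for illustration; substitute your own `dat`. Whether dropping the column is appropriate depends on whether you still need it for stratification.

```r
# Hypothetical toy data standing in for `dat`
dat <- data.frame(
  f1 = c(0.1, 0.5, 0.9, 1.3),
  m1 = factor(c("a", "a", "a", "a")),  # only one level: triggers the contrasts error
  p1 = c(1L, 2L, 1L, 2L)
)

# Count the distinct values in each column
n_unique <- vapply(dat, function(x) length(unique(x)), integer(1))

# Names of columns that carry no variation
constant_cols <- names(n_unique)[n_unique < 2]
print(constant_cols)

# Keep only the columns with at least two distinct values
dat_clean <- dat[, n_unique >= 2, drop = FALSE]
```

Note that `unique()` counts `NA` as a value, so a column holding one value plus missing entries would not be flagged by this check.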

sunxd