I have referred to "convert data.frame column format from character to factor", "Converting multiple data.table columns to factors in R", and "Convert column classes in data.table".

Unfortunately, they did not solve my problem. I am working with the bodyfat dataset, and my data frame is called `bf`. I added a column called `agegrp` to categorize persons of different ages as young, middle, or old, thus:

bf$agegrp<-ifelse(bf$age<=40, "young", ifelse(bf$age>40 & bf$age<55,"middle", "old"))

This is the ctree analysis:

> set.seed(1234)
> modelsample<-sample(2, nrow(bf), replace=TRUE, prob=c(0.7, 0.3))
> traindata<-bf[modelsample==1, ]
> testdata<-bf[modelsample==2, ]
> predictor<-agegrp~DEXfat+waistcirc+hipcirc+kneebreadth
> bf_ctree<-ctree(predictor, data=traindata)

I got the following error:

Error in trafo(data = data, numeric_trafo = numeric_trafo, factor_trafo = factor_trafo,  : 
  data class character is not supported
In addition: Warning message:
In storage.mode(RET@predict_trafo) <- "double" : NAs introduced by coercion

Since bf$agegrp is of class "character", I ran:

> bf$agegrp<-as.factor(bf$agegrp)

The agegrp column is now coerced to factor:

> class(bf$agegrp)
[1] "factor"

I tried running the ctree again, but it throws the same error. Does anyone know what the root cause of the problem is?

vagabond
  • Try `br$agegrp <- as.factor(bf$agegrp)`. – jlhoward Mar 01 '14 at 21:44
  • That will create a new variable. How will I use it in my existing bf dataframe to run the c-tree? – vagabond Mar 01 '14 at 21:49
  • @jlhoward `bf$agegrp<-as.factor(bf$agegrp)` returns bf$agegrp as factor but my ctree is returning the same error! – vagabond Mar 01 '14 at 21:53
  • Sorry, typo: I meant `bf$agegrp <- as.factor(bf$agegrp)` – jlhoward Mar 01 '14 at 21:53
  • Gonna try taking the whole thing from the top! – vagabond Mar 01 '14 at 21:55
  • I don't quite follow. You're passing traindata to ctree, but you're making changes in bf...? – joran Mar 01 '14 at 22:09
  • @joran I created two partitions of the dataframe as traindata and testdata using this: `set.seed(1234)`, `modelsample<-sample(2, nrow(bf), replace=TRUE, prob=c(0.7, 0.3))`, `traindata<-bf[modelsample==1, ]` and `testdata<-bf[modelsample==2, ]` , I'll add that to my question. – vagabond Mar 01 '14 at 22:13
  • How is `traindata` defined? If I run `bf_ctree<-ctree(predictor, data=bf)`, I don't get an error. – jlhoward Mar 01 '14 at 22:20
  • @jlhoward I just updated it in the question, all the commands in the ctree analysis – vagabond Mar 01 '14 at 22:27

1 Answer

This works for me:

library(mboost)  # bodyfat data; in newer versions use data("bodyfat", package = "TH.data")
library(party)
bf <- bodyfat
bf$agegrp <- cut(bf$age,c(0,40,55,100),labels=c("young","middle","old"))
predictor <- agegrp~DEXfat+waistcirc+hipcirc+kneebreadth

set.seed(1234)
modelsample <-sample(2, nrow(bf), replace=TRUE, prob=c(0.7, 0.3))
traindata   <-bf[modelsample==1, ]
testdata    <-bf[modelsample==2, ]
bf_ctree    <-ctree(predictor, data=traindata)
plot(bf_ctree)
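
A possible explanation for the original error, in line with joran's comment above: `traindata` was a copy taken while `bf$agegrp` was still character, so converting `bf$agegrp` afterwards would leave `traindata$agegrp` unchanged. In the code above the factor is created before the split, which avoids this. The sketch below illustrates the copy behaviour with base R only; `toy`, `train_old`, and `train_new` are made-up names and data, not the bodyfat dataset.

```r
# Toy stand-in for bf (hypothetical data, not the bodyfat dataset)
toy <- data.frame(age = c(25, 45, 60, 38, 52, 70))
toy$agegrp <- ifelse(toy$age <= 40, "young",
                     ifelse(toy$age < 55, "middle", "old"))

# Subset taken while agegrp is still a character column
train_old <- toy[c(1, 2, 3, 5), ]

# Converting the column in the parent data frame afterwards
toy$agegrp <- as.factor(toy$agegrp)

class(toy$agegrp)        # "factor"
class(train_old$agegrp)  # "character": the earlier copy is unaffected

# Re-create the subset after the conversion before refitting
train_new <- toy[c(1, 2, 3, 5), ]
class(train_new$agegrp)  # "factor"
```

If this is what happened, re-running the `traindata` and `testdata` lines after `as.factor()` should be enough for `ctree` to accept the response.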

jlhoward
  • Bang on! The problem seemed to be in the way I added the variable `$agegrp` with `ifelse`. Any idea why that may have been causing the problem? – vagabond Mar 01 '14 at 22:46
  • Not really. The factor levels are in a different order with `as.factor(ifelse(...ifelse(...)))`. Perhaps that has something to do with it. – jlhoward Mar 01 '14 at 22:52
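
To make the level-order remark concrete, here is a small base R sketch with made-up ages (`ages`, `grp_chr`, and `grp_fac` are illustrative names only). It shows that `as.factor()` on the `ifelse` result sorts the levels alphabetically, that `factor(..., levels = )` fixes the order explicitly, and that `cut()` with its default right-closed intervals treats an age of exactly 55 differently from the `ifelse` version. Whether any of this affects `ctree` is not established here.

```r
ages <- c(25, 45, 60, 38, 52, 70)  # made-up ages, not the bodyfat data
grp_chr <- ifelse(ages <= 40, "young",
                  ifelse(ages < 55, "middle", "old"))

# as.factor() orders levels alphabetically
levels(as.factor(grp_chr))  # "middle" "old" "young"

# An explicit levels argument gives the intended order
grp_fac <- factor(grp_chr, levels = c("young", "middle", "old"))
levels(grp_fac)             # "young" "middle" "old"

# Boundary difference: cut() uses (40, 55] by default, so 55 is "middle",
# while the ifelse() version (age < 55) labels 55 as "old"
edge_cut    <- as.character(cut(55, c(0, 40, 55, 100),
                                labels = c("young", "middle", "old")))
edge_ifelse <- ifelse(55 <= 40, "young", ifelse(55 < 55, "middle", "old"))
```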