I have a dataframe which contains both numeric variables and factors.

When moving data from one dataframe to another, everything is kept as I would like it:

copy_data<-as.data.frame(original_data)

This creates a copy of 'original_data' with factors remaining factors.

When I try a more complex version, the end result is a dataframe of numeric values, when I want the factors to still be factors:

model_data<-with(subset(copy_data, copy_data$var1<0), 
as.data.frame(cbind(var1, var2, var3, factor1, factor2, factor3)))

So factor1, factor2, and factor3 all end up numeric rather than factors. What am I missing? I've tried with and without as.data.frame and defining model_data as a dataframe before populating it.

My searches of the StackExchange archive return mostly results about deliberately changing factors to variables, and haven't helped me much. The slightly clunky title is to differentiate my question from those.

Gavin Simpson
gisol
    Welcome to StackOverflow. Perhaps if you made a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) that demonstrates your question / problem, people would find it easier to answer. A reproducible example in your case will contain some sample data that demonstrates the problem. – Andrie Aug 02 '12 at 16:05

1 Answer

?cbind says that cbind returns a matrix if all the inputs are vectors (which they are in your case). A matrix can only contain a single atomic type (character, numeric, logical, etc.). Factors are not an atomic type, so they are converted to their underlying integer codes.
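As a minimal sketch of that coercion, assuming a small made-up numeric vector `x` and factor `f` (both hypothetical, not taken from the question):

```r
# Hypothetical toy inputs, invented for illustration only
x <- c(-1.5, 0.2, -0.7)
f <- factor(c("low", "high", "low"))

# All inputs are plain vectors, so cbind builds a matrix
m <- cbind(x, f)

is.matrix(m)    # TRUE
is.numeric(m)   # TRUE
m[, "f"]        # the factor's integer codes, not its labels
```

The labels "low" and "high" are gone from `m`; only the codes remain, which is why the columns look numeric once the matrix is turned back into a data frame.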

The "Data frame methods" section says that the cbind data.frame method just wraps data.frame(..., check.names=FALSE), so you could just call data.frame directly (the call to cbind is redundant).

model_data <- with(subset(copy_data, copy_data$var1<0), 
  data.frame(var1, var2, var3, factor1, factor2, factor3))
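
To check the result, here is a self-contained sketch with made-up data. The column names follow the question, but the values are invented for illustration and only one factor column is included to keep it short:

```r
# Hypothetical stand-in for the question's copy_data
copy_data <- data.frame(
  var1    = c(-2, 1, -0.5, 3),
  var2    = c(10, 20, 30, 40),
  factor1 = factor(c("a", "b", "a", "c"))
)

# Same pattern as above: subset the rows, then build the data frame directly
model_data <- with(subset(copy_data, var1 < 0),
  data.frame(var1, var2, factor1))

str(model_data)
is.factor(model_data$factor1)   # TRUE
```

Note that the subset keeps all of the original factor levels, including ones that no longer occur in the remaining rows; `droplevels(model_data)` removes them if that matters for your model.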
Joshua Ulrich
  • Brilliant, works perfectly - I had assumed the `cbind` was necessary after reading the `as.data.frame` documentation and hadn't thought to try without it, only to try `c`, which obviously doesn't work. – gisol Aug 02 '12 at 16:17