I have a function in which I define a data.frame and then fill it with data inside loops. At some point I get this warning:

Warning messages:
1: In `[<-.factor`(`*tmp*`, iseq, value = "CHANGE") :
  invalid factor level, NAs generated

Therefore, when I define my data.frame, I'd like to set the option stringsAsFactors to FALSE but I don't understand how to do it.

I have tried:

DataFrame = data.frame(stringsAsFactors=FALSE)

And also:

options(stringsAsFactors=FALSE)

What is the correct way to set the stringsAsFactors option?

Daniel Widdis
VincentH
  • see http://stackoverflow.com/questions/2851015/r-convert-data-frame-columns-from-factors-to-characters – GSee Jul 18 '12 at 13:23
  • Additional information for people searching for stringsAsFactors: since the R 4.0 release, stringsAsFactors defaults to FALSE. The [R blog article](https://developer.r-project.org/Blog/public/2020/02/16/stringsasfactors/index.html) covers the history in detail. Also see the [manual page for data.frame](https://stat.ethz.ch/R-manual/R-devel/library/base/html/data.frame.html). – toshi-san May 19 '20 at 02:38

1 Answer

It depends on how you fill your data frame, for which you haven't given any code. When you construct a new data frame, you can do it like this:

x <- data.frame(aName = aVector, bName = bVector, stringsAsFactors = FALSE)

In this case, if aVector is a character vector, for example, then the data frame column x$aName will be a character vector as well, not a factor. Combining x with an existing data frame (using rbind, cbind or similar) should preserve that type.
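
As a minimal sketch of the constructor approach, assuming two made-up vectors (the values in aVector and bVector are placeholders, not from the question), the following shows that the column stays a character vector and that assigning a new string no longer triggers the "invalid factor level" warning:

```r
# Hypothetical example vectors; the values are placeholders for illustration.
aVector <- c("KEEP", "DROP")
bVector <- c(1, 2)

# Ask data.frame() explicitly not to convert strings to factors.
x <- data.frame(aName = aVector, bName = bVector, stringsAsFactors = FALSE)

# The column is stored as plain character, not as a factor.
is.character(x$aName)

# Assigning a previously unseen string works, because there are no levels to violate.
x$aName[2] <- "CHANGE"
print(x)
```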

When you execute

options(stringsAsFactors = FALSE)

you change the global default setting, so every data frame you create after executing that line will not auto-convert strings to factors unless explicitly told to do so. If you only need to avoid conversion in a single place, I would rather not change the default. However, if this affects many places in your code, changing the default seems like a good idea.
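
If you do change the option but want to limit its reach, one possible pattern is to set it inside a function and restore the previous value on exit. This is only a sketch: build_frame and its contents are hypothetical names chosen for illustration.

```r
# Hypothetical helper: string conversion is turned off only for the duration
# of this call, and the caller's setting is restored afterwards.
build_frame <- function() {
  old <- options(stringsAsFactors = FALSE)  # options() returns the previous values
  on.exit(options(old), add = TRUE)         # restore them when the function exits
  data.frame(status = c("KEEP", "DROP"))
}

df <- build_frame()
class(df$status)
```

Saving the value returned by options() and restoring it with on.exit() keeps the change from leaking into unrelated code that runs later in the same session.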

One more thing: if a vector or column is already a factor, then neither of the above will change it back into a character vector. To do so, you have to convert it explicitly using as.character or similar.
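
A short sketch of that conversion, assuming a made-up data frame whose string columns were already created as factors (the column names status, group and n are hypothetical):

```r
# Hypothetical data frame with two factor columns and one integer column.
df <- data.frame(
  status = factor(c("KEEP", "DROP")),
  group  = factor(c("a", "b")),
  n      = 1:2
)

# Convert a single column back to character.
df$status <- as.character(df$status)

# Or convert every remaining factor column in one pass.
is_fac <- vapply(df, is.factor, logical(1))
df[is_fac] <- lapply(df[is_fac], as.character)

# New values can now be assigned without the "invalid factor level" warning.
df$status[2] <- "CHANGE"
str(df)
```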

pyrrhic
MvG