
I have a few columns that I need to convert to factors:

for cols in ['col1','col2']:
  df$cols<-as.factor(as.character(df$cols))

Error

for cols in ['col1','col2']:
Error: unexpected symbol in "for cols"
>   df$cols<-as.factor(as.character(df$cols))
Error in `$<-.data.frame`(`*tmp*`, cols, value = integer(0)) : 
  replacement has 0 rows, data has 942
noob

2 Answers


The syntax shown uses a Python-style for loop and a Python list. In R, the loop header goes in parentheses, the body in braces, and the column names in a character vector built with c():

for (col in c('col1', 'col2')) {
  df[[col]] <- factor(df[[col]])
}

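NOTE: here we use [[ instead of $, because df$cols looks for a column literally named "cols" rather than using the value stored in the loop variable. That lookup returns NULL, which is why the assignment fails with "replacement has 0 rows". The loop body also needs braces {}. factor() can be applied directly; the as.character() wrapper is not needed.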
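As a minimal sketch of the difference between $ and [[, here is a small made-up data frame (the values are hypothetical; only the column names col1 and col2 come from the question):

```r
# Hypothetical example data; only the column names come from the question
df <- data.frame(col1 = c("a", "b", "a"),
                 col2 = c(1, 2, 2),
                 stringsAsFactors = FALSE)

cols <- "col1"

# `$` takes the name literally: this looks for a column called "cols",
# which does not exist, so the result is NULL
dollar_result <- df$cols

# `[[` evaluates the variable, so this returns the column named "col1"
bracket_result <- df[[cols]]

# The corrected loop: parentheses, braces, c() and `[[`
for (col in c("col1", "col2")) {
  df[[col]] <- factor(df[[col]])
}

# Inspect the resulting column classes
sapply(df, class)
```

Assigning a zero-length value to a data frame column is what triggers the "replacement has 0 rows" error in the question.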


Or with lapply, which does the same thing in one line of base R (no packages needed):

df[c('col1', 'col2')] <- lapply(df[c('col1', 'col2')], factor)
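A short sketch of the lapply approach on made-up data, assuming the column names are kept in a variable (to_convert and col3 are hypothetical names introduced for illustration). It shows that only the listed columns are converted:

```r
# Hypothetical data: col3 is an extra column that should stay numeric
df <- data.frame(col1 = c("x", "y", "x"),
                 col2 = c(10, 20, 10),
                 col3 = c(0.5, 1.5, 2.5),
                 stringsAsFactors = FALSE)

to_convert <- c("col1", "col2")   # names of the columns to convert

# df[to_convert] is a list of columns; lapply() applies factor() to each
# and the result is assigned back to the same columns
df[to_convert] <- lapply(df[to_convert], factor)

# Check the class of every column
classes <- sapply(df, class)
print(classes)
```

Keeping the names in one variable avoids repeating the vector on both sides of the assignment.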

Or with dplyr:

library(dplyr)
df <- df %>%
  mutate_at(vars(col1, col2), factor)
akrun

To complement @akrun's solution, this can also be done with data.table:

library(data.table)
setDT(df)
# x is each selected column in turn (avoid naming the argument c, which shadows c())
df[, c("col1", "col2") := lapply(.SD, function(x) as.factor(as.character(x))), .SDcols = c("col1", "col2")]

Note that := updates df by reference, so no reassignment is needed.

linog