Let DT be a data.table:

DT <- data.table(V1 = factor(1:10),
                 V2 = factor(1:10),
                 ...
                 V9 = factor(1:10))

Is there a better/simpler method to do multicolumn factor conversion like this:

DT[,`:=`(
  Vn1=as.numeric(V1),
  Vn2=as.numeric(V2),
  Vn3=as.numeric(V3),
  Vn4=as.numeric(V4),
  Vn5=as.numeric(V5),
  Vn6=as.numeric(V6),
  Vn7=as.numeric(V7),
  Vn8=as.numeric(V8),
  Vn9=as.numeric(V9)
)]

Column names are totally arbitrary.

Frank

1 Answer

Yes. The most efficient way is probably to call set() in a for loop.

First pick the columns to modify (to convert every column, use names(DT) instead):

cols <- c("V1", "V2", "V3") 

Then just run the loop

for (j in cols) set(DT, i = NULL, j = j, value = as.numeric(DT[[j]]))

A slightly less efficient but more readable alternative is the following (note the parentheses around cols, which make data.table evaluate the variable instead of creating a column literally named "cols"):

## if you choose all the names in DT, you don't need to specify the `.SDcols` parameter
DT[, (cols) := lapply(.SD, as.numeric), .SDcols = cols] 
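The question creates new columns (Vn1, Vn2, ...) rather than overwriting the originals. A minimal sketch of how the same `:=` idiom might do that follows. The three-column table and the "_num" suffix are assumptions made up for illustration, since the real column names are arbitrary.

```r
library(data.table)

# Small example table; names and values are made up for illustration
DT <- data.table(V1 = factor(1:10),
                 V2 = factor(1:10),
                 V3 = factor(1:10))

cols <- c("V1", "V2", "V3")

# Hypothetical naming scheme for the new columns
new_cols <- paste0(cols, "_num")

# The left-hand side names the columns to create,
# .SDcols names the columns to read from
DT[, (new_cols) := lapply(.SD, as.numeric), .SDcols = cols]

str(DT)
```

The original factor columns stay untouched, and any character vector of the same length as cols can be used for the new names.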

Both should be efficient even for a big data set. You can read some more about data.table basics here


Be careful, though, when converting factors to numeric this way: as.numeric() on a factor returns the internal integer codes, not the level labels. See here for more details
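To illustrate that caveat, here is a small sketch with made-up data in which the factor labels differ from the internal codes. Going through as.character() first is one common way to recover the labels as numbers; the table and its values are assumptions for illustration only.

```r
library(data.table)

# Made-up data: the labels are not 1, 2, 3, ...
DT <- data.table(V1 = factor(c("10", "20", "30")),
                 V2 = factor(c("5", "15", "5")))
cols <- c("V1", "V2")

# Direct conversion gives the internal integer codes
codes <- as.numeric(DT$V1)                  # 1 2 3

# Converting via character gives the labels themselves
labels <- as.numeric(as.character(DT$V1))   # 10 20 30

# The same idea applied to several columns with set()
for (j in cols) {
  set(DT, i = NULL, j = j, value = as.numeric(as.character(DT[[j]])))
}
```

In the question's example the levels happen to be 1 to 10 in order, so codes and labels coincide and the difference does not show up.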

Community
David Arenburg