I'm trying to do something simple: divide each of 40 columns of a data.table by its mean. I cannot provide the actual data (not all columns are numeric, and I have more than 8M rows), but here's an example:
library(data.table)
dt <- data.table(matrix(sample(1:100, 4000, TRUE), ncol = 40))
colmeans <- colMeans(dt)
Next I thought I would do:
for (col in names(colmeans)) dt[,col:=dt[,col]/colmeans[col]]
But this returns an error, since dt[,col] requires that column names are not quoted. Using as.name(col) doesn't cut it.
Now,
res <- t(t(dt[,1:40,with=F])/colmeans)
contains the expected result, but I cannot insert it back into the data.table, as
dt[,1:40] <- res
does not work, and neither does dt[,1:40:=res, with=F].
The following works, but I find it quite ugly:
for (i in seq_along(colmeans)) dt[,i:=dt[,i,with=F]/colmeans[i],with=F]
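For comparison, a by-name version of that loop can be sketched as below. It assumes a data.table version that accepts a parenthesized left-hand side in `:=`, so that `(col)` is read as "the column whose name is stored in col", and it uses `dt[[col]]` to pull the column out as a plain vector. The seed is arbitrary and only there for reproducibility.

```r
library(data.table)

set.seed(1)  # arbitrary seed, assumption for reproducibility only
dt <- data.table(matrix(sample(1:100, 4000, TRUE), ncol = 40))
colmeans <- colMeans(dt)

for (col in names(colmeans)) {
  # (col) tells := to use the string held in col as the target column name;
  # dt[[col]] extracts that column as a vector, colmeans[[col]] is its mean
  dt[, (col) := dt[[col]] / colmeans[[col]]]
}
```

Each column is updated by reference, so no copy of the whole table should be needed.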
Sure, I could also recreate a new data.table by calling data.table() on res and the other non-numeric columns my data.table has, but isn't there anything more efficient?
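One possible direction that avoids rebuilding the table, sketched under assumptions: select the numeric columns by type and update them all in one `:=` call via `.SD` and `.SDcols`. The `label` column and the `num_cols` name are hypothetical and stand in for the non-numeric columns in the real data. Missing values are not handled here.

```r
library(data.table)

set.seed(1)  # arbitrary seed, assumption for reproducibility only
dt <- data.table(matrix(sample(1:100, 4000, TRUE), ncol = 40))
dt[, label := sample(letters, .N, TRUE)]  # hypothetical non-numeric column

# Names of the numeric columns; the character column is left out
num_cols <- names(dt)[vapply(dt, is.numeric, logical(1))]

# Divide every numeric column by its own mean, updating dt by reference
dt[, (num_cols) := lapply(.SD, function(x) x / mean(x)), .SDcols = num_cols]
```

Whether this is faster than the explicit loop on 8M rows would need to be measured on the real data.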