I have a data.table `DT` and I want to run `model.matrix` on it. Each row has a string ID, which is stored in the `ID` column of `DT`. When I run `model.matrix` on `DT`, my formula excludes the `ID` column. The problem is that `model.matrix` drops some rows because of NAs. If I set the rownames of `DT` to the `ID` column before calling `model.matrix`, then the final model matrix has rownames, and I'm all set. Otherwise, I can't figure out which rows I end up with. I'm setting the rownames with `rownames(DT) = DT$ID`. However, when I then try to add a new column to `DT`, I get a complaint about
"Invalid .internal.selfref detected . . . At an earlier point, this data.table has been copied by R."
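
For concreteness, here is a minimal sketch of what I'm doing. The data and the predictor columns `x` and `y` are made up for illustration, and `newcol` is a placeholder name; only `DT` and `ID` correspond to my real table.

```r
library(data.table)

# Toy data; x and y are hypothetical predictor columns.
DT <- data.table(ID = c("a", "b", "c"),
                 x  = c(1, NA, 3),
                 y  = c(2, 5, 6))

# Set the rownames to the ID column so the model matrix keeps them.
rownames(DT) = DT$ID

# The formula excludes ID; the row with the NA in x is dropped.
mm <- model.matrix(~ x + y, data = DT)

# Adding a column by reference afterwards is where I see the
# .internal.selfref complaint in my session.
DT[, newcol := 1]
```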
So I'm wondering:
- Is there a better way to set rownames for a `data.table`?
- Is there a better approach to solving this problem?
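
One direction I've considered that avoids rownames altogether, as a rough sketch only: work out which rows have no NAs in the variables the formula uses, and look up the IDs from that. This assumes rows are dropped solely because of NAs in those variables (the default `na.omit` behaviour), and `x` and `y` are again hypothetical column names.

```r
library(data.table)

# Toy data; x and y are hypothetical predictor columns.
DT <- data.table(ID = c("a", "b", "c", "d"),
                 x  = c(1, NA, 3, 4),
                 y  = c(2, 5, NA, 8))

# Rows with no NA in the variables used by the formula.
keep <- complete.cases(DT[, .(x, y)])

mm <- model.matrix(~ x + y, data = DT)

# IDs of the surviving rows, in the same order as the rows of mm
# (assuming only NA-containing rows were dropped).
kept_ids <- DT$ID[keep]

# Sanity check: one ID per row of the model matrix.
stopifnot(nrow(mm) == length(kept_ids))
```

This leaves `DT` itself untouched, so adding columns afterwards should not involve rownames at all, but I'm not sure it is the idiomatic way.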