I want like convert a data.table object to the following condition by using table
function. That means, colb
as colnames, and cola
as rownames, and then if the individual col
include corresponding row
, then put the entry as 1
, otherwise 0
. I do it like this:
dt <- data.table(cola = c(1, 1, 2, 3), colb = c(10, 20, 30, 40))
dt
table(dt)
> dt
cola colb
1: 1 10
2: 1 20
3: 2 30
4: 3 40
> table(dt)
colb
cola 10 20 30 40
1 1 1 0 0
2 0 0 1 0
3 0 0 0 1
But when the data set is big, for example 39-million rows by 2 columns in my case, the table operation cost ~80 seconds to finish.
I would like to know is there more efficient method to do the same thing as table function did?
In addition, dcast.data.table(dt, cola ~ colb, fill = 0L)
do the same thing when I try it, but the resutls have a little different which should be futher deal with to get the same results as table function. The important thing is that dcast.data.table
do not improve the speed when I try my data. So, I hope someone can help to work out more efficient method to do the same thing!
Thank you.