My data looks like this:
View(df)
row Events
1 237,2,236,102,106,111,114,115,116,117,118,119,125
2 237,111,116
3 102,106,111,114,115
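For anyone who wants to reproduce this, here is a minimal sketch of the sample as a data frame. The column names `row` and `event` are an assumption taken from the code used later in this question (`df$row`, `df$event`); the real data may name them differently.

```r
# Reproducible version of the three sample rows shown above.
# Column names `row` and `event` are assumed, not confirmed by the View() output.
df <- data.frame(
  row   = 1:3,
  event = c("237,2,236,102,106,111,114,115,116,117,118,119,125",
            "237,111,116",
            "102,106,111,114,115"),
  stringsAsFactors = FALSE  # keep the event lists as plain character strings
)
```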
I have around 3.5 million rows, and I want to create a new binary column for each event code, like this:
row 237 2 236 102 106 111 114 115 116 117 118 119 125 126
1 1 1 1 1 1 1 1 1 1 1 1 1 1 0
2 1 0 0 0 0 1 0 0 1 0 0 0 0 0
3 0 0 0 1 1 1 1 1 0 0 0 0 0 0
I used the solution from "Create new columns with dummies based on values", which is:
Event <- as.data.frame.matrix(table(stack(setNames(strsplit(df$event, ","), df$row))[2:1]))
It worked on a small data set, but with the 3.5 million rows I get this error:
Error in table(stack(setNames(strsplit(data$event, ","), data$row))[2:1]) :
attempt to make a table with >= 2^31 elements
I think the error occurs because the table I'm building is too big, but I really need those columns. How can I fix this?
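To back up that guess, here is a rough size check. As far as I understand, `table()` on two variables allocates a dense array with one cell per (row, event code) pair, and that array must stay below 2^31 elements. The `event` vector below is a toy stand-in, not my real data, and `n_rows` is only the approximate row count.

```r
# Toy stand-in for the real column (values chosen for illustration only).
event <- c("237,2,236", "237,111,116", "102,106,111")

# Number of distinct event codes across all rows.
n_codes <- length(unique(unlist(strsplit(event, ",", fixed = TRUE))))

# Cells a dense rows-by-codes table would need at the real row count.
n_rows  <- 3.5e6             # approximate row count of the real data
n_cells <- n_rows * n_codes  # double arithmetic, so no integer overflow here

# Largest number of distinct codes that still keeps the table under 2^31 cells.
max_codes <- floor((2^31 - 1) / n_rows)
```

With 3.5 million rows, `max_codes` works out to 613, so the error would be triggered as soon as the real data contains 614 or more distinct event codes.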