How can I rearrange a data.table by id, so that the columns within the same id are put next to each other? The following code as a short example of input and desired output
require(data.table)
set.seed(456)
DT <- data.table(id = c(1,1,1, 2,2, 3,3, 4,4,4, 5),
X = round(runif(11, 0, 1),2),
Y = round(runif(11, 5, 10)),
A = c(rep(9, 3), rep(10, 2), rep(11, 2), rep(12, 3), 13))
Input
DT
# id X Y A
# 1: 1 0.09 6 9
# 2: 1 0.21 9 9
# 3: 1 0.73 9 9
# 4: 2 0.85 8 10
# 5: 2 0.79 8 10
# 6: 3 0.33 9 11
# 7: 3 0.08 7 11
# 8: 4 0.29 9 12
# 9: 4 0.24 6 12
# 10: 4 0.39 6 12
# 11: 5 0.37 9 13
# ...
I want something like this:
for every subset by id: After all constant variables (within each id, here: A and id), put all different variables (within each id, here: X- and Y-columns) in the same row by creating new columns (X -> X1, X2, X3, Y -> Y1, Y2, Y3)
The occurences by id are different and vary in my case between 1 and 3, so the missing values for all ids with occurences less than 3 should be filled with NA
.
Desired output
# id A X1 Y1 X2 Y2 X3 Y3
# 1 9 0.09 6 0.21 9 0.73 9
# 2 10 0.85 8 0.79 8 NA NA
# 3 11 0.33 9 0.08 7 NA NA
# 4 12 0.29 9 0.24 6 0.39 6
# 5 13 0.37 9 NA NA NA NA
# ...
It is easy to get the number of items in every subset with DT[,.N, by = id]
, but I struggle with the next steps. Of course the number of columns and rows is much larger, so that I would prefer solutions with less naming of the columns (but of course somewhere the constant and different columns should be defined).