Say I have a data.table where C columns hold discrete values drawn from N possible values:
library(data.table)
set.seed(123)
datapoints <- data.table(replicate(3, sample(0:5, 4, replace = TRUE)))
print(datapoints)
   V1 V2 V3
1:  1  5  3
2:  4  0  2
3:  2  3  5
4:  5  5  2
(here C=3 and N=6, since the possible values are 0 through 5)
I want to add N columns, one per possible value, each containing TRUE if any of the C columns in that row holds that value and FALSE otherwise:
   V1 V2 V3  has0  has1  has2  has3  has4  has5
1:  1  5  3 FALSE  TRUE FALSE  TRUE FALSE  TRUE
2:  4  0  2  TRUE FALSE  TRUE FALSE  TRUE FALSE
3:  2  3  5 FALSE FALSE  TRUE  TRUE FALSE  TRUE
4:  5  5  2 FALSE FALSE  TRUE FALSE FALSE  TRUE
I have tried this:
for (value in 0:5) {
  datapoints <- datapoints[, (paste("has", value, sep="")) := (value %in% .SD), .SDcols = c("V1", "V2", "V3")]
}
The columns are added, but every one of them is filled with FALSE:
   V1 V2 V3  has0  has1  has2  has3  has4  has5
1:  1  5  3 FALSE FALSE FALSE FALSE FALSE FALSE
2:  4  0  2 FALSE FALSE FALSE FALSE FALSE FALSE
3:  2  3  5 FALSE FALSE FALSE FALSE FALSE FALSE
4:  5  5  2 FALSE FALSE FALSE FALSE FALSE FALSE
It seems to me that the code would work if I replaced .SD with a reference to the current row (instead of the whole table), but I don't know how to do that.
What is an efficient way of adding these columns?
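For reference, here is a minimal sketch of one possible approach, not necessarily the most efficient one. It assumes that `.SD` is the whole subset passed as a list of columns, so `value %in% .SD` tests the value against entire columns and the single resulting FALSE is recycled down every row. The sketch instead compares each column to the value with `==` and combines the per-column results row by row using `Reduce` and `|`. The names `value_cols` and `v` are introduced here for illustration only, and the data are typed in by hand from the printout above so the example does not depend on the random seed.

```r
library(data.table)

# Same values as the printed table in the question, entered by hand.
datapoints <- data.table(
  V1 = c(1L, 4L, 2L, 5L),
  V2 = c(5L, 0L, 3L, 5L),
  V3 = c(3L, 2L, 5L, 2L)
)

value_cols <- c("V1", "V2", "V3")  # the C columns to search (illustrative name)

for (v in 0:5) {
  # lapply() yields one logical vector per column (column == v);
  # Reduce(`|`, ...) ORs those vectors element-wise, i.e. per row.
  datapoints[, (paste0("has", v)) := Reduce(`|`, lapply(.SD, `==`, v)),
             .SDcols = value_cols]
}

print(datapoints)
```

Because `:=` modifies the table by reference, reassigning the result to `datapoints` inside the loop is not needed. Note that `==` returns NA for missing values, so columns containing NA would need extra handling that this sketch does not attempt.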