Update: it seems that with = F is incompatible with expressions in j and also with (at least some) by = situations.
Taking the scenario below and simplifying it as much as possible:
library(data.table)
dt <- data.table(group1 = c("a", "a", "a", "b", "b", "b"),
                 group2 = c("x", "x", "y", "y", "z", "z"),
                 data = c(rep(T, 3), rep(F, 3)))
# with = F: by = appears to be ignored and only column 3 comes back
dt[
  ,
  3,
  with = F,
  by = list(group1, group2)
]
    data
1:  TRUE
2:  TRUE
3:  TRUE
4: FALSE
5: FALSE
6: FALSE
# no with = F: by = is respected and the grouping columns are returned
dt[
  ,
  data,
  by = list(group1, group2)
]
   group1 group2  data
1:      a      x  TRUE
2:      a      x  TRUE
3:      a      y  TRUE
4:      b      y FALSE
5:      b      z FALSE
6:      b      z FALSE
The expression behavior is documented in a roundabout way in ?data.table, in the description of j:
"A single column name, single expression of column names, list() of expressions of column names, an expression or function call that evaluates to list (including data.frame and data.table which are lists, too), or (when with=FALSE) a vector of names or positions to select."
I don't see anything in the documentation saying that with = F disables by =, but in this case it seems that it does.
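To make the quoted passage concrete, here is a minimal sketch of the with = FALSE behavior with no by = involved. It only illustrates my reading of the documentation: with with = FALSE, j is treated as a vector of column names or positions to select, and without it j is evaluated as an expression over the columns. The variable names by_position, by_name and total are just for illustration.

```r
library(data.table)

dt <- data.table(group1 = c("a", "a", "a", "b", "b", "b"),
                 group2 = c("x", "x", "y", "y", "z", "z"),
                 data = c(rep(TRUE, 3), rep(FALSE, 3)))

# with = FALSE: j is a column position or a column name, not an expression
by_position <- dt[, 3, with = FALSE]
by_name <- dt[, "data", with = FALSE]

# Both calls select the same single column, "data"
names(by_position)
identical(by_position$data, by_name$data)

# Default (with = TRUE): j is an expression evaluated using the columns of dt
total <- dt[, sum(data)]
```

So under with = FALSE there is no expression left for by = to apply to, which would be consistent with what I am seeing, although I have not found this stated explicitly.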
I'm having an issue where data.table either uses or ignores by = depending on whether I use with = F.
library(data.table)
dt <- data.table(group1 = c("a", "a", "a", "b", "b", "b"),
                 group2 = c("x", "x", "y", "y", "z", "z"),
                 data = c(rep(T, 3), rep(F, 3)))
# without with = F
dt[
  as.vector(!is.na(dt[, 3, with = F])),
  sum(data),
  by = list(group1, group2)
]
   group1 group2 V1
1:      a      x  2
2:      a      y  1
3:      b      y  0
4:      b      z  0
# with = F
dt[
  as.vector(!is.na(dt[, 3, with = F])),
  sum(3),
  with = F,
  by = list(group1, group2)
]
    data
1:  TRUE
2:  TRUE
3:  TRUE
4: FALSE
5: FALSE
6: FALSE
I've tried both a vector of numbers and a vector of characters for by =; neither works.
sum() is just an example function; I have the same basic issue when I don't use a function in j.
In the end, I need to use with = F to iterate across multiple columns of the data.table in a for loop.
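For concreteness, this is roughly the kind of loop I mean. The sketch below avoids with = F altogether by passing the column name through .SDcols and using dt[[col]] for the NA filter; I am not sure this is the recommended approach. The data2 column and the names value_cols, keep and results are hypothetical, added only so the loop has more than one column to visit.

```r
library(data.table)

dt <- data.table(group1 = c("a", "a", "a", "b", "b", "b"),
                 group2 = c("x", "x", "y", "y", "z", "z"),
                 data = c(rep(TRUE, 3), rep(FALSE, 3)))
# Hypothetical second value column, not part of my real data
dt[, data2 := c(TRUE, NA, FALSE, TRUE, TRUE, NA)]

value_cols <- c("data", "data2")
results <- list()
for (col in value_cols) {
  # Rows where the current column is not NA
  keep <- !is.na(dt[[col]])
  # .SDcols restricts .SD to the current column, so by = still applies
  results[[col]] <- dt[keep, lapply(.SD, sum),
                       by = list(group1, group2), .SDcols = col]
}

results$data
results$data2
```

Each element of results keeps the group1 and group2 columns plus one summed column named after the column being processed.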
Any suggestions?