I want to perform several operations that mix dtplyr and data.table code. My question is whether, having loaded dtplyr, I can apply dplyr verbs to a data.table object and get optimized data.table code, as I would with a lazy_dt.
Here are some examples. In each case, would dtplyr translate the pipeline to data.table code, or is plain dplyr doing the work?
# Setup for all chunks:
library(dplyr)
library(data.table)
library(dtplyr)
a) setDT
dataframe # class data.frame
setDT(dataframe)
dataframe %>%
group_by(id) %>%
mutate(rows_per_group = n())
b) data.table object
dt <- as.data.table(dataframe) # or dt <- data.table::fread(filepath)
dt %>%
group_by(id) %>%
mutate(rows_per_group = n())
Also, if all of them make dtplyr work, which is the most efficient option among a), b) and c) using lazy_dt(dataframe)?
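For reference, here is a minimal sketch of option c) and of one way to check empirically what happens in case b). The toy `dataframe` and the names `lazy`, `res_b` and `res_c` are made up for illustration. It relies on `show_query()` printing the data.table expression that dtplyr generates, and on lazy pipelines carrying the class `dtplyr_step`. What case b) returns may depend on the installed dtplyr version, so the sketch only inspects the result and does not assume an answer.

```r
library(dplyr)
library(data.table)
library(dtplyr)

# Hypothetical toy data standing in for `dataframe`
dataframe <- data.frame(id = c(1, 1, 2), x = c(10, 20, 30))

# c) explicit lazy_dt: the verbs are recorded and translated, not run yet
lazy <- lazy_dt(dataframe) %>%
  group_by(id) %>%
  mutate(rows_per_group = n())

show_query(lazy)               # prints the generated data.table expression
res_c <- as.data.table(lazy)   # forces evaluation of the translated call

# b) plain data.table: inspect what the same pipeline returns here
dt <- as.data.table(dataframe)
res_b <- dt %>%
  group_by(id) %>%
  mutate(rows_per_group = n())

class(res_b)
inherits(res_b, "dtplyr_step") # TRUE would indicate dtplyr handled the verbs
```

If `inherits(res_b, "dtplyr_step")` is FALSE and `class(res_b)` shows a grouped data frame, the pipeline was presumably evaluated by dplyr's data.frame methods and not translated.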