We stumbled upon some strange behaviour trying to expand a data.table. The following code works alright:
library(data.table)
dt <- data.table(var1=1:2e3, var2=1:2e3, freq=1:2e3)
system.time(dt.expanded <- dt[ ,list(freq=rep(1,freq)),by=c("var1","var2")])
## user system elapsed
## 0.05 0.01 0.06
But using the following data.table
set.seed(1)
dt <- data.table(var1=sample(letters,1000,replace=TRUE),var2=sample(LETTERS,1000,replace=TRUE),freq=sample(1:10,1000,replace=TRUE))
with the same code gives
Error in rep(1, freq) : invalid 'times' argument
My question
Might this be a bug in data.table?
(I got the syntax of this example from R Machine Learning Essentials)
Edit
So the problem really seems to be with rep and not with data.table. The help page for rep says for the parameter times:
A integer vector giving the (non-negative) number of times to repeat each element if of length length(x), or to repeat the whole vector if of length 1.
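To make the quoted rule concrete, here is a small base R sketch of the three cases. The variable names are only illustrative, and the error is caught with tryCatch so the snippet runs to the end:

```r
# times of length 1: the whole vector x is repeated
whole <- rep(1, 3)

# times of length length(x): each element is repeated its own number of times
each_elem <- rep(c(1, 2), c(2, 3))

# times of any other length (here length 2 for an x of length 1) is rejected
msg <- tryCatch(rep(1, c(2, 3)), error = function(e) conditionMessage(e))

print(whole)      # 1 1 1
print(each_elem)  # 1 1 2 2 2
print(msg)        # "invalid 'times' argument"
```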
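In the first data.table every (var1, var2) combination is unique, so freq has length 1 within each group. In the second data.table some combinations occur more than once, so within those groups freq is longer than x (which is just 1). times is then neither of length 1 nor of length length(x), which throws the error.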
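For what it is worth, here are two possible ways around this, as a sketch rather than a definitive answer. Both assume the goal is one output row per unit of freq. The names dt.expanded and dt.expanded2 are just placeholders:

```r
library(data.table)

set.seed(1)
dt <- data.table(var1 = sample(letters, 1000, replace = TRUE),
                 var2 = sample(LETTERS, 1000, replace = TRUE),
                 freq = sample(1:10, 1000, replace = TRUE))

# Largest group size: greater than 1 because some (var1, var2) pairs repeat
max.group.size <- dt[, .N, by = c("var1", "var2")][, max(N)]

# Option 1: collapse freq to a single number per group before calling rep
dt.expanded <- dt[, list(freq = rep(1, sum(freq))), by = c("var1", "var2")]

# Option 2: skip the grouping and repeat each original row freq times
dt.expanded2 <- dt[rep(seq_len(nrow(dt)), freq)][, freq := 1L]
```

Both results have sum(dt$freq) rows. Option 1 returns the rows grouped by var1 and var2, while option 2 keeps them in the original row order.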