Regarding tidyr::complete vs base::expand.grid, performance might also be a factor.
According to the benchmarks below, complete is much slower than expand.grid, though the relative difference shrinks as the input grows.
library(tidyr)
library(microbenchmark)

df <- data.frame(a = 1:10, b = 1:10)
microbenchmark(complete(df, a, b), expand.grid(df))
# Unit: microseconds
#               expr       min       lq       mean    median        uq       max neval
# complete(df, a, b) 15345.348 16065.27 17947.2132 16609.512 17351.317 46415.772   100
#    expand.grid(df)   129.194   144.74   174.8799   194.395   201.337   256.577   100
df <- data.frame(a = 1:100, b = 1:100)
microbenchmark(complete(df, a, b), expand.grid(df))
# Unit: microseconds
#               expr       min         lq       mean     median        uq      max neval
# complete(df, a, b) 15992.523 16380.1030 17743.4860 16611.4730 16998.149 26622.31   100
#    expand.grid(df)   323.588   340.4925   376.6481   383.6575   397.844   665.89   100
df <- data.frame(a = 1:1000, b = 1:1000)
microbenchmark(complete(df, a, b), expand.grid(df))
# Unit: milliseconds
#               expr      min       lq     mean   median       uq       max neval
# complete(df, a, b) 86.58981 88.49813 98.73944 93.62617 98.83436 157.40141   100
#    expand.grid(df) 18.99899 19.40211 21.83331 21.20161 23.71123  33.19729   100
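
To put a number on how the gap narrows, the median timings reported above can be compared directly. This is only a rough sketch in base R: the `medians` data frame and its column names are made up for illustration, and the values are copied by hand from the benchmark output (the last run converted from milliseconds to microseconds), so they reflect that one machine and run.

```r
# Median timings copied from the benchmark output above, in microseconds.
# n is the upper bound of the 1:n ranges used for columns a and b.
# The n = 1000 run was reported in milliseconds and is converted here.
medians <- data.frame(
  n           = c(10, 100, 1000),
  complete    = c(16609.512, 16611.4730, 93.62617 * 1000),
  expand_grid = c(194.395, 383.6575, 21.20161 * 1000)
)

# How many times slower complete() is than expand.grid() at each size
medians$ratio <- medians$complete / medians$expand_grid
print(round(medians$ratio, 1))
# [1] 85.4 43.3  4.4
```

On these figures complete is roughly 85 times slower at n = 10, 43 times at n = 100, and 4 times at n = 1000. Its median barely moves between the first two runs (about 16.6 ms in both), which suggests, without proving it, that a fixed per-call overhead dominates for small inputs.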