I'm trying to speed up the creation of a table with all possible combinations of two vectors. Base R provides this functionality through expand.grid(). However, I was wondering whether the same result can be achieved faster using tools from the {collapse} package.
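To make the target output concrete, here is a minimal sketch on toy input. The vectors `x` and `y` are made-up values for illustration, not the data used in the benchmark further down.

```r
# Toy vectors (hypothetical values, only for illustration)
x <- c(2001, 2002)
y <- c("Anna", "Ben", "Cleo")

# Every element of x paired with every element of y;
# the first argument varies fastest
grid <- expand.grid(x, y, stringsAsFactors = FALSE)

nrow(grid)  # length(x) * length(y) rows
grid
```

Any faster alternative should return the same pairs, ideally while keeping the original column types.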
There has been a StackOverflow thread about this topic here. But even the fastest solution provided there turns out to be the slowest in the following case. Although tidyr::expand_grid() is speedier than base R, I still hope that the collapse package can give faster processing times.
#library(collapse)
#library(tidyr)
library(babynames)
year <- collapse::funique(babynames$year, sort = TRUE)
names <- collapse::funique(babynames$name)
expand.grid.jc <- function(seq1, seq2) { ## from https://stackoverflow.com/a/10407457/6105259
  as.data.frame(cbind(Var1 = rep.int(seq1, length(seq2)),
                      Var2 = rep.int(seq2, rep.int(length(seq1), length(seq2)))))
}
my_benchmarking <-
  bench::mark(base = expand.grid(year, names),
              jc = expand.grid.jc(year, names),
              tidyr = tidyr::expand_grid(year, names),
              check = FALSE, iterations = 10)
#> Warning: Some expressions had a GC in every iteration; so filtering is disabled.
my_benchmarking
#> # A tibble: 3 x 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 base        965.3ms    1.06s    0.938      701MB    2.35
#> 2 jc            13.1s   13.39s    0.0747     820MB    0.120
#> 3 tidyr       541.2ms 656.71ms    1.55       316MB    1.24
Created on 2021-08-22 by the reprex package (v2.0.0)
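One caveat about the benchmark: because check = FALSE, the three results are never compared with each other. The sketch below uses toy vectors (`toy_year` and `toy_names` are made-up values, not the babynames data). It suggests that the jc helper does not return the same column types as expand.grid(), since cbind() builds a matrix and a matrix can hold only one type. That coercion might explain part of its slowness here, though I have not profiled it.

```r
# Same helper as in the benchmark above
expand.grid.jc <- function(seq1, seq2) {
  as.data.frame(cbind(Var1 = rep.int(seq1, length(seq2)),
                      Var2 = rep.int(seq2, rep.int(length(seq1), length(seq2)))))
}

# Toy inputs (hypothetical values, only for illustration)
toy_year  <- c(2001, 2002)
toy_names <- c("Anna", "Ben", "Cleo")

res_jc   <- expand.grid.jc(toy_year, toy_names)
res_base <- expand.grid(toy_year, toy_names)

# cbind() of a numeric and a character vector yields a character matrix,
# so Var1 is no longer numeric in the jc result
is.numeric(res_jc$Var1)
is.numeric(res_base$Var1)
```

So a fair comparison should probably also confirm that each candidate returns the year column as a number.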
Would be happy to learn whether this task could possibly be computed faster.