Question

I have a tibble with two columns of parameters for different models, and I'd like to generate a list column that holds, for each model, a vector of its first 80 outputs. The sample here has 100 models, but in the future I'll be looking at over 10,000, which is why I'd like each model's outputs stored in the same row as its parameters.

What I've Tried

library(tidyverse)
new.tibble <- tibble(a = rep(1:5, 20))
new.tibble <- new.tibble %>% add_column(b = 1:100)

my.vector <- c(1:80)

# What I want
new.tibble <- new.tibble %>% mutate(c = lapply(my.vector, function(x) {a ^ (x / b)}))

I've tried this with the different apply functions (sapply, lapply, apply, etc.), and none of them seem to work. When I run that last line, I get the following:

> new.tibble <- new.tibble %>% mutate(c = lapply(my.vector, function(x) {a ^ (x / b)}))
Error: Column `c` must be length 100 (the number of rows) or one, not 80

This leads me to believe that mutate is generating only 80 outputs in total, rather than 80 outputs per row stored as a list in that row, as I'd like. I've tried making my tibble row-wise to see if that would help:

> row.tibble <- rowwise(new.tibble)
> row.tibble <- row.tibble %>% mutate(c = lapply(my.vector, function(x){a ^ (x / b)}))
Error: Column `c` must be length 1 (the group size), not 80

It did not. I know it would not be hard to set up a while loop that generates the outputs as separate lists, but with over 10,000 lists, each corresponding to a model in a row, I'd like to think that a list-column is the best way to organize the outputs. I've also tried using as.list to force the returned output to be a list, but that didn't work as I intended:

> row.tibble <- row.tibble %>% mutate(c = as.list(lapply(my.vector, FUN = function(x){a ^ (x / b)})))
Error: Column `c` must be length 1 (the group size), not 80
> new.tibble <- new.tibble %>% mutate(c = as.list(lapply(my.vector, FUN = function(x){a ^ (x / b)})))
Error: Column `c` must be length 100 (the number of rows) or one, not 80

I also tried dropping the lapply and computing the desired outputs directly, and that didn't work either:

> new.tibble %>% mutate(c = as.list(a ^ (my.vector / b)))
# A tibble: 100 x 3
       a     b c        
   <int> <int> <list>   
 1     1     1 <dbl [1]>
 2     2     2 <dbl [1]>
 3     3     3 <dbl [1]>
 4     4     4 <dbl [1]>
 5     5     5 <dbl [1]>
 6     1     6 <dbl [1]>
 7     2     7 <dbl [1]>
 8     3     8 <dbl [1]>
 9     4     9 <dbl [1]>
10     5    10 <dbl [1]>
# ... with 90 more rows
Warning message:
In my.vector/b :
  longer object length is not a multiple of shorter object length

> row.tibble %>% mutate(c = as.list(a ^ (my.vector / b)))
Error: Column `c` must be length 1 (the group size), not 80

Additional Information

> sessionInfo()
R version 4.0.0 (2020-04-24)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] forcats_0.5.0   stringr_1.4.0   dplyr_0.8.5     purrr_0.3.4     readr_1.3.1     tidyr_1.0.3     tibble_3.0.1    ggplot2_3.3.0   tidyverse_1.3.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.4.6     cellranger_1.1.0 pillar_1.4.4     compiler_4.0.0   dbplyr_1.4.3     tools_4.0.0      jsonlite_1.6.1   lubridate_1.7.8  lifecycle_0.2.0 
[10] nlme_3.1-147     gtable_0.3.0     lattice_0.20-41  pkgconfig_2.0.3  rlang_0.4.6      reprex_0.3.0     cli_2.0.2        DBI_1.1.0        rstudioapi_0.11 
[19] haven_2.2.0      withr_2.2.0      xml2_1.3.2       httr_1.4.1       fs_1.4.1         generics_0.0.2   vctrs_0.2.4      hms_0.5.3        grid_4.0.0      
[28] tidyselect_1.0.0 glue_1.4.0       R6_2.4.1         fansi_0.4.1      readxl_1.3.1     modelr_0.1.7     magrittr_1.5     backports_1.1.6  scales_1.1.1    
[37] ellipsis_0.3.0   rvest_0.3.5      assertthat_0.2.1 colorspace_1.4-1 utf8_1.1.4       stringi_1.4.6    munsell_0.5.0    broom_0.5.6      crayon_1.3.4
  • Your first premise of using `lapply(my.vector, ...)` is flawed from the start: `mutate` (and most dplyr verbs) expect the return to be length 1 or the same length as the number of rows; the frame at that point has 100 rows, so ... your length of 80 is wrong. If you start with a frame of 100 rows, what do you suppose happens to the other 20 rows? To where do your list of 80 correspond? – r2evans Jun 10 '20 at 18:23
  • @r2evans Good to know about the `mutate`! Would I be better off using `add_column`? I tried `> row.tibble %>% add_column(c = as.list(a ^ (my.vector / b)))` and got `Error in as.list(a^(my.vector/b)) : object 'a' not found` As well as a similar error when I tried with new.tibble. – Andrew Flash Jun 10 '20 at 18:38
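
The length mismatch described in the comments can be reproduced outside of `mutate`. The following is a minimal base-R sketch, assuming plain vectors `a` and `b` that stand in for the tibble columns; the names `per_x`, `per_row`, and `recycled` are hypothetical and used only for illustration. It shows that iterating over `my.vector` yields 80 elements that each span all 100 rows, whereas iterating over the rows yields 100 elements of length 80.

```r
# Plain vectors standing in for the tibble columns (assumption for illustration)
a <- rep(1:5, 20)
b <- 1:100
my.vector <- 1:80

# lapply() iterates over my.vector, so it returns one element per x.
# Each element uses the full columns a and b, so it has one value per row.
per_x <- lapply(my.vector, function(x) a ^ (x / b))
length(per_x)
length(per_x[[1]])

# Iterating over the rows instead returns one element per row,
# and each element holds the outputs for every value in my.vector.
per_row <- Map(function(a_i, b_i) a_i ^ (my.vector / b_i), a, b)
length(per_row)
length(per_row[[1]])

# my.vector / b recycles the shorter vector to the longer length (with a
# warning), which is why as.list() on it gave one number per row.
recycled <- suppressWarnings(my.vector / b)
length(recycled)
```

The second shape (one element per row) is the one `mutate` can accept as a list column for a 100-row tibble.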

1 Answer


Are you looking for this kind of result?

new.tibble <- new.tibble %>% 
  mutate(c = map2(.x = a, .y = b, .f = ~.x^(my.vector/.y) ))

Output:

head(new.tibble)
# A tibble: 6 x 3
      a     b c         
  <int> <int> <list>    
1     1     1 <dbl [80]>
2     2     2 <dbl [80]>
3     3     3 <dbl [80]>
4     4     4 <dbl [80]>
5     5     5 <dbl [80]>
6     1     6 <dbl [80]>
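
As a rough check of the result, and as one possible alternative, here is a sketch that rebuilds the example tibble, compares one row of the `map2` output against a direct calculation, and then builds the same column with `rowwise()` by wrapping the per-row vector in `list()`. The names `with_map2`, `with_rowwise`, `row2_matches`, and `same_result` are hypothetical, and the `rowwise()` variant is an assumption about what the row-wise attempts in the question were aiming for, not something taken from the answer.

```r
library(dplyr)
library(purrr)
library(tibble)

my.vector <- 1:80
new.tibble <- tibble(a = rep(1:5, 20), b = 1:100)

# map2() walks a and b in parallel, one row at a time, and returns a list
# with one length-80 vector per row, which mutate() accepts as a list column.
with_map2 <- new.tibble %>%
  mutate(c = map2(a, b, ~ .x ^ (my.vector / .y)))

# Row 2 has a = 2 and b = 2, so its entry should equal 2 ^ (my.vector / 2).
row2_matches <- isTRUE(all.equal(with_map2$c[[2]], 2 ^ (my.vector / 2)))

# With rowwise(), each row is its own group, so the expression must return
# something of length 1: wrapping the length-80 vector in list() does that.
with_rowwise <- new.tibble %>%
  rowwise() %>%
  mutate(c = list(a ^ (my.vector / b))) %>%
  ungroup()

same_result <- isTRUE(all.equal(with_map2$c, with_rowwise$c))
```

With 10,000 or more rows, the `map2` form is likely the simpler choice, since it avoids the per-row grouping that `rowwise()` sets up.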
Valeri Voev