
Can someone tell me how to pass a character vector of column names as the arguments of a function using dplyr?

library("dplyr", quietly = TRUE, warn.conflicts = FALSE) # version 0.8.0.1

# Does not work
iris %>% rowwise() %>%  mutate(v1 = mean( as.name(names(iris)[-5]) ) )
iris %>% rowwise() %>%  mutate(v1 = mean( !!(names(iris)[-5]) ) )
iris %>% rowwise() %>%  mutate(v1 = mean( enquo(names(iris)[-5]) ) )
iris %>% rowwise() %>%  
mutate(v1 = mean( c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")  ) )

# This works and is the intended result
iris %>% rowwise() %>%  
mutate(v1 = mean( c(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width )  ) )

The point is to have the function (mean or any other function) work with names(iris)[-5], or with any character vector holding the names of the variables.

I have looked at these questions without success: "dplyr mutate_each_ standard evaluation" and "dplyr: Standard evaluation and enquo()".

My session information:

R version 3.5.3 (2019-03-11)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17763)

Matrix products: default

locale:
[1] LC_COLLATE=French_France.1252  LC_CTYPE=French_France.1252   
[3] LC_MONETARY=French_France.1252 LC_NUMERIC=C                  
[5] LC_TIME=French_France.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ggplot2_3.1.0   visdat_0.5.3    lubridate_1.7.4 naniar_0.4.2   
[5] dplyr_0.8.0.1  

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.1       rstudioapi_0.10  magrittr_1.5     tidyselect_0.2.5
 [5] munsell_0.5.0    colorspace_1.4-0 R6_2.4.0         rlang_0.3.4     
 [9] fansi_0.4.0      stringr_1.4.0    plyr_1.8.4       tools_3.5.3     
[13] grid_3.5.3       packrat_0.5.0    gtable_0.2.0     utf8_1.1.4      
[17] cli_1.1.0        withr_2.1.2      digest_0.6.18    lazyeval_0.2.2  
[21] assertthat_0.2.0 tibble_2.1.1     crayon_1.3.4     tidyr_0.8.3     
[25] purrr_0.3.2      glue_1.3.1       labeling_0.3     stringi_1.4.3   
[29] compiler_3.5.3   pillar_1.3.1     scales_1.0.0     pkgconfig_2.0.2 

Thanks in advance!

cbo

3 Answers

5

Use map2_dbl

library(tidyverse)
iris %>% mutate(v1 = map2_dbl(Sepal.Length, Sepal.Width, ~mean(c(.x, .y)))) %>% head

#  Sepal.Length Sepal.Width Petal.Length Petal.Width Species   v1
#1          5.1         3.5          1.4         0.2  setosa 4.30
#2          4.9         3.0          1.4         0.2  setosa 3.95
#3          4.7         3.2          1.3         0.2  setosa 3.95
#4          4.6         3.1          1.5         0.2  setosa 3.85
#5          5.0         3.6          1.4         0.2  setosa 4.30
#6          5.4         3.9          1.7         0.4  setosa 4.65

Or, if you want to take the mean of certain columns given by name:

cols <- c("Sepal.Length", "Sepal.Width")

iris %>% mutate(v1 = rowMeans(.[cols])) %>% head
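The same idea extends to the full vector from the question. The following is only a sketch, assuming `cols <- names(iris)[-5]` and that the mean is the statistic wanted; `expected` is a hypothetical helper name used for the check. Inside `mutate()`, `.` is the piped data frame, so `.[cols]` is ordinary base-R subsetting by a character vector and no quasiquotation is needed.

```r
library(dplyr)

cols <- names(iris)[-5]  # the four numeric column names, as a character vector

# `.` refers to the data frame coming through the pipe, so .[cols] is a
# plain data-frame subset and rowMeans() returns one mean per row
res <- iris %>% mutate(v1 = rowMeans(.[cols]))

# Hypothetical check: same values as spelling the columns out by hand
expected <- (iris$Sepal.Length + iris$Sepal.Width +
             iris$Petal.Length + iris$Petal.Width) / 4
stopifnot(isTRUE(all.equal(res$v1, expected)))
```

This only covers statistics that have a vectorised row-wise helper such as `rowMeans()` or `rowSums()`; an arbitrary function needs a per-row call.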
Ronak Shah
  • And `map2_dbl` is preferable over `rowwise`, as the latter isn't being ["actively developed"](https://github.com/tidyverse/dplyr/issues/3144#issuecomment-341530896) anymore. – Dan Jun 21 '19 at 13:54
  • Thanks for your answer, it is really close. How can I make it work with names(iris)[1:4]? `iris %>% mutate(v1 = purrr::map_dbl(.x = as.list(names(iris)[1:4]), .f~mean(.x))) %>% head` gives an error. – cbo Jun 21 '19 at 13:56
  • @cbo I added an option with `rowMeans`, which accepts a vector of names. For two columns you can also use `rlang::sym`: `iris %>% mutate(v1 = map2_dbl(!!sym(cols[1]), !!sym(cols[2]), ~mean(c(.x, .y))))` – Ronak Shah Jun 21 '19 at 14:06
3

We can use rowMeans in base R

cols <-  c("Sepal.Length", "Sepal.Width")
iris$v1 <- rowMeans(iris[cols])

Or in tidyverse

library(tidyverse)
iris %>%
    mutate(v1 = select(., cols)  %>% reduce(`+`)/length(cols)) %>%
    head
#  Sepal.Length Sepal.Width Petal.Length Petal.Width Species   v1
#1          5.1         3.5          1.4         0.2  setosa 4.30
#2          4.9         3.0          1.4         0.2  setosa 3.95
#3          4.7         3.2          1.3         0.2  setosa 3.95
#4          4.6         3.1          1.5         0.2  setosa 3.85
#5          5.0         3.6          1.4         0.2  setosa 4.30
#6          5.4         3.9          1.7         0.4  setosa 4.65

Another option is pmap (it should also work when there are more than two columns)

iris %>%
      mutate(v1 = pmap_dbl(.[cols], ~ mean(c(...))))
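To make the `pmap_dbl()` call above less opaque, here is a minimal sketch on a toy data frame. The data frame `df` and its column names are assumptions made up for illustration, not part of the original post. `pmap_dbl()` iterates over the columns in parallel: for row i it calls the function once with the i-th element of every column, `...` collects those values, and `c(...)` turns them into the vector handed to `mean()`.

```r
library(dplyr)
library(purrr)

# Hypothetical toy data, small enough to check by hand
df <- data.frame(a = c(1, 2), b = c(3, 4), d = c(5, 9))
cols <- c("a", "b", "d")

# Row 1 becomes mean(c(a = 1, b = 3, d = 5)), row 2 becomes mean(c(a = 2, b = 4, d = 9))
out <- df %>% mutate(v1 = pmap_dbl(.[cols], ~ mean(c(...))))

out$v1
#> [1] 3 5
```

Because the function is called once per row, this is usually slower than `rowMeans()` on large tables, but it accepts any function that maps a vector to a single number.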
akrun
  • `pmap` is definitely the tidyverse answer, thanks! (`reduce` is nice too, but I am not sure it works with custom stats.) – cbo Jun 24 '19 at 14:53
  • @cbo Yes, it can work with a custom function. Make sure to check the arguments passed – akrun Jun 24 '19 at 14:54
1

Thank you @Ronak Shah and @akrun for your answers. My question was probably not well formulated from the start; pmap is what I was looking for:

cols <- names(iris)[-5]

library(dplyr, quietly = TRUE, warn.conflicts = FALSE)

iris %>% mutate(v1 = rowMeans(.[cols])) %>% head # ok with mean per rows
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species    v1
#> 1          5.1         3.5          1.4         0.2  setosa 2.550
#> 2          4.9         3.0          1.4         0.2  setosa 2.375
#> 3          4.7         3.2          1.3         0.2  setosa 2.350
#> 4          4.6         3.1          1.5         0.2  setosa 2.350
#> 5          5.0         3.6          1.4         0.2  setosa 2.550
#> 6          5.4         3.9          1.7         0.4  setosa 2.850

# Creating a custom stat function
set.seed(123)
w0 <- rnorm(n = 10)
mystat <- function(x, w = w0[1:length(x)]) sum(x*w)/length(x)

iris[1, cols] %>% mystat # test value
#> [1] -0.3669384

# Tests
iris %>% mutate(v1 = mystat(.[cols])) %>% head # KO: mystat() is called once on the whole data frame, so a single value is recycled
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species       v1
#> 1          5.1         3.5          1.4         0.2  setosa 109.1179
#> 2          4.9         3.0          1.4         0.2  setosa 109.1179
#> 3          4.7         3.2          1.3         0.2  setosa 109.1179
#> 4          4.6         3.1          1.5         0.2  setosa 109.1179
#> 5          5.0         3.6          1.4         0.2  setosa 109.1179
#> 6          5.4         3.9          1.7         0.4  setosa 109.1179

library(purrr, quietly = TRUE, warn.conflicts = FALSE)
iris %>% mutate(v1 = map_dbl(list(.[cols]), mystat)) %>% head # KO: a list of length one means a single call, same recycled value
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species       v1
#> 1          5.1         3.5          1.4         0.2  setosa 109.1179
#> 2          4.9         3.0          1.4         0.2  setosa 109.1179
#> 3          4.7         3.2          1.3         0.2  setosa 109.1179
#> 4          4.6         3.1          1.5         0.2  setosa 109.1179
#> 5          5.0         3.6          1.4         0.2  setosa 109.1179
#> 6          5.4         3.9          1.7         0.4  setosa 109.1179

iris %>% mutate(v1 = pmap_dbl(.[cols], ~ mystat(c(...)))) %>% head # OK: one call per row, matches the test value
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species         v1
#> 1          5.1         3.5          1.4         0.2  setosa -0.3669384
#> 2          4.9         3.0          1.4         0.2  setosa -0.3101425
#> 3          4.7         3.2          1.3         0.2  setosa -0.3325953
#> 4          4.6         3.1          1.5         0.2  setosa -0.2348935
#> 5          5.0         3.6          1.4         0.2  setosa -0.3586810
#> 6          5.4         3.9          1.7         0.4  setosa -0.3115633
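The difference between the failing and working attempts comes down to how many times the function is called and on what. The sketch below uses a hypothetical stand-in `stat()` and a made-up data frame `df` (neither is from the post) so that the numbers can be checked by hand.

```r
library(purrr)

# Hypothetical stand-in for mystat(): reduces its input to one number
stat <- function(x) sum(x) / length(x)

# Toy data, an assumption for illustration only
df <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30))

# Direct call: x is the whole table, and length(x) is the number of
# columns (2), so the result is a single number, 66 / 2
whole <- stat(df)

# map_dbl() over a list of length one is still a single call on the table
mapped <- map_dbl(list(df), stat)

# pmap_dbl() calls stat() once per row with that row's values
per_row <- pmap_dbl(df, ~ stat(c(...)))

whole    # 33
mapped   # 33
per_row  # 5.5 11.0 16.5
```

Inside `mutate()`, a length-one result such as `whole` is recycled to every row, which is why the failing attempts show the same value repeated.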
cbo
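For readers on a newer dplyr than the 0.8.0.1 used in the question, a possible alternative is `rowwise()` combined with `c_across()`. This is a sketch that assumes dplyr >= 1.0.0 (where `c_across()` and the tidyselect helper `all_of()` are available); it is not something the answers above relied on.

```r
library(dplyr)  # assumption: dplyr >= 1.0.0, newer than the version in the question

cols <- names(iris)[-5]

res <- iris %>%
  rowwise() %>%
  # all_of() selects columns from a character vector;
  # c_across() combines the selected values of the current row into one vector
  mutate(v1 = mean(c_across(all_of(cols)))) %>%
  ungroup()

# Sanity check against the base-R row means
stopifnot(isTRUE(all.equal(res$v1, unname(rowMeans(iris[cols])))))
```

Any function that takes a vector and returns a single value can replace `mean()` here, for example a custom statistic like `mystat()`.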