Using dplyr::select semantics within a dplyr::mutate function

Question

I'm trying to bring dplyr::select() semantics into a function supplied to dplyr::mutate(). Below is a minimal example.

library(dplyr)

# two classes of 10 rows each, with three numeric variables
dat <- tibble(class = rep(c("A", "B"), each = 10),
              x = sample(100, 20),
              y = sample(100, 20),
              z = sample(100, 20))

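.reorder_rows <- function(...) {
    # collect the column vectors passed in and bind them into a matrix
    x <- list(...)
    y <- as.matrix(do.call("cbind", x))
    # hierarchical clustering of the rows; return the dendrogram ordering
    h <- hclust(dist(y))
    return(h$order)
}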

dat %>%
    group_by(class) %>%
    mutate(h_order = .reorder_rows(x, y, z))

##    class     x     y     z h_order
##   <chr> <int> <int> <int>   <int>
## 1      A    85    17     5       1
## 2      A    67    24    35       5
## ...
## 18     B    76     7    94       9
## 19     B    65    39    85       8
## 20     B    49    11   100      10
## 
## Note: function applied across each group, A and B

What I would like to do is something along the lines of:

dat %>%
    group_by(class) %>%
    mutate(h_order = .reorder_rows(-class))

The reason this is important is that when dat has many more variables, I need to be able to exclude the grouping/specific variables from the function's calculation.

I'm not sure how this would be implemented, but somehow using select semantics within the .reorder_rows function might be one way to tackle this problem.

Definitely was going more for what you asked in your question, in terms of being able to use select helpers directly inside a function (I think it has something to do with the environment). — robert.amezquita, May 18 '17 at 12:52

score 3 · Accepted Answer · answered May 17 '17 at 19:09

For this particular approach, you should probably nest and unnest (using tidyr) by class rather than grouping by it:

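library(dplyr)
library(tidyr)
library(purrr)

# tidyr >= 1.0.0 syntax; older releases used nest(-class) and unnest()
dat %>%
  # one row per class, with the remaining columns in a list-column `data`
  nest(data = -class) %>%
  # cluster each nested data frame separately
  mutate(h_order = map(data, .reorder_rows)) %>%
  # expand both list-columns back to one row per observation
  unnest(cols = c(data, h_order))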

Incidentally, while this works with your function, you could also write a shorter version that takes the data frame directly:

.reorder_rows <- function(x) {
  h <- hclust(dist(as.matrix(x)))
  return(h$order)
}
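
As an alternative sketch, assuming dplyr 1.1.0 or later is installed (pick() does not exist in earlier releases): pick() accepts select-style expressions inside mutate() and returns the current group's columns as a data frame, so it can be handed straight to the data-frame version of .reorder_rows. The set.seed() call and the res name are only there for illustration.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed only so the example data are reproducible
dat <- tibble(class = rep(c("A", "B"), each = 10),
              x = sample(100, 20),
              y = sample(100, 20),
              z = sample(100, 20))

# data-frame version of the helper: cluster the rows, return the ordering
.reorder_rows <- function(x) {
  h <- hclust(dist(as.matrix(x)))
  h$order
}

res <- dat %>%
  group_by(class) %>%
  # pick() evaluates a tidyselect expression against the current group;
  # grouping columns are already left out, so `class` never reaches hclust()
  mutate(h_order = .reorder_rows(pick(everything()))) %>%
  ungroup()
```

Other select-style expressions, such as pick(x, y) or pick(-z), can replace everything() to exclude specific non-grouping variables from the calculation.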

David Robinson


Definitely, in this case I think the tidyr + map approach is the one to use! (I am still curious how select helpers could be brought into custom functions, but that could probably be discussed in another question.) Thanks! – robert.amezquita May 18 '17 at 12:47
