I'm trying to bring dplyr::select() semantics into a function supplied to dplyr::mutate(). Below is a minimal example.

library(dplyr)

dat <- tibble(class = rep(c("A", "B"), each = 10),
              x = sample(100, 20),
              y = sample(100, 20),
              z = sample(100, 20))

.reorder_rows <- function(...) {
    x <- list(...)
    y <- as.matrix(do.call("cbind", x))
    h <- hclust(dist(y))
    return(h$order)
}

dat %>%
    group_by(class) %>%
    mutate(h_order = .reorder_rows(x, y, z))

##    class     x     y     z h_order
##   <chr> <int> <int> <int>   <int>
## 1      A    85    17     5       1
## 2      A    67    24    35       5
## ...
## 18     B    76     7    94       9
## 19     B    65    39    85       8
## 20     B    49    11   100      10
## 
## Note: function applied across each group, A and B

What I would like to do is something along the lines of:

dat %>%
    group_by(class) %>%
    mutate(h_order = .reorder_rows(-class))

This matters because when dat has many more variables, I need to be able to exclude the grouping variable (or other specific variables) from the function's calculation.

I'm not sure how this would be implemented, but somehow using select semantics within the .reorder_rows function might be one way to tackle this problem.

  • Definitely was going more for what you asked in your question, in terms of being able to use select helpers directly inside a function (I think it has something to do with the environment). – robert.amezquita May 18 '17 at 12:52

1 Answer


For this particular approach, you should probably nest and unnest (using tidyr) by class rather than grouping by it:

library(tidyr)
library(purrr)

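dat %>%
  # one row per class; the remaining columns go into the list-column `data`
  # (named-argument syntax for nest()/unnest() as of tidyr 1.0.0)
  nest(data = -class) %>%
  mutate(h_order = map(data, .reorder_rows)) %>%
  unnest(cols = c(data, h_order))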

Incidentally, while this works with your function as written, you could also write a shorter version that takes the data frame directly:

.reorder_rows <- function(x) {
  h <- hclust(dist(as.matrix(x)))
  return(h$order)
}
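
On the remaining question of using select helpers inside mutate(): a minimal sketch, assuming dplyr 1.1.0 or later (where pick() is available), is to pass the columns of the current group to the data-frame version of .reorder_rows above. The seed is arbitrary and only there for reproducibility. Inside a grouped mutate(), the grouping column is already excluded from what pick() can select, so there is no need to write -class.

```r
library(dplyr)  # assumption: dplyr >= 1.1.0, which provides pick()

set.seed(1)  # arbitrary seed, for reproducibility only
dat <- tibble(class = rep(c("A", "B"), each = 10),
              x = sample(100, 20),
              y = sample(100, 20),
              z = sample(100, 20))

# Data-frame version of the helper, as defined above
.reorder_rows <- function(x) {
  h <- hclust(dist(as.matrix(x)))
  h$order
}

res <- dat %>%
  group_by(class) %>%
  # pick() returns the selected columns of the current group as a tibble;
  # the grouping column `class` is not among the selectable columns
  mutate(h_order = .reorder_rows(pick(everything()))) %>%
  ungroup()
```

Other select helpers should work the same way, for example pick(x:z) or pick(where(is.numeric)), which is useful when only some of the non-grouping columns should enter the clustering.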
David Robinson
  • Definitely, in this case I think the tidyr + map approach is the one to use! [Although I am still curious as to how select helpers could be brought into custom functions, that could probably be discussed in another question.] Thanks! – robert.amezquita May 18 '17 at 12:47