I went through a lot of questions similar to mine, but each addressed only one part of my problem. I am using dplyr with standard evaluation to accommodate variable names. This works fine for filter_ and group_by_ in a pipe. However, in summarise_ I cannot use a variable for the name of the output column that holds the sum. An example will make it clear.

library(dplyr)
library(lazyeval)

# create data
a <- data.frame(
  x = c(2010, 2010, 2011, 2011, 2011),
  y_zm = c(rep(10, 5)),
  y_r2 = c(rep(20, 5)))

# define variable names
tag <- "2011"
metric <- "y"
run1 <- "zm"
run2 <- "r2"

# working example for a pipe with fixed variable name
a %>%
  filter_(~x == tag) %>%
  group_by_(tag) %>%
  summarise_(variable_name = interp(~sum(var, na.rm = TRUE), 
                                    var = as.name(paste0(metric,"_",run1))))

# non-working example of what I want to do
a %>%
  filter_(~x == tag) %>%
  group_by_(tag) %>%
  summarise_(as.name(paste0(metric,"_",run1)) = 
               interp(~sum(var, na.rm = T), 
                      var = as.name(paste0(metric,"_",run1))))

I tried a lot of different things involving as.name() or interp() but nothing seems to work.

Frank
Triamus
  • Maybe just add a `rename_` step a la [this answer](http://stackoverflow.com/a/30383036/2461552)? It could look something like `rename_(.dots = setNames("variable_name", paste0(metric,"_",run1)))`. – aosmith Aug 19 '15 at 16:26

1 Answer

After poring over the NSE vignette for a while and poking at things, I found you can use setNames within summarise_ if you pass the result through the .dots argument and wrap the interp call in a list.

a %>%
    filter_(~x == tag) %>%
    group_by_(tag) %>%
    summarise_(.dots = setNames(
        list(interp(~sum(var, na.rm = TRUE),
                    var = as.name(paste0(metric, "_", run1)))),
        paste0(metric, "_", run1)))

Source: local data frame [1 x 2]

  2011 y_zm
1 2011   30
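
For what it's worth, the reason the `.dots` route works can be seen without dplyr at all. Inside a call, the left-hand side of `=` has to be a literal name, so `as.name(...) = ...` is a syntax error; a named list built with setNames has no such restriction. Below is a minimal base R sketch of that idea. The names `col_name`, `dots` and `res` are made up for illustration, and `do.call(data.frame, ...)` is only a stand-in for "a function that receives computed argument names", not what summarise_ does internally.

```r
# Hypothetical names for illustration; base R only.
col_name <- paste0("y", "_", "zm")

# Attach the computed name to a list element after the fact.
dots <- setNames(list(quote(sum(y_zm, na.rm = TRUE))), col_name)
names(dots)

# The same pattern hands a computed argument name to any function:
# here the single list element becomes a column called y_zm.
res <- do.call(data.frame, setNames(list(30), col_name))
names(res)
```

In the pipe above, `.dots` plays the role of that list: each element is an expression to evaluate and each name is the output column.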

You could also add a rename_ step to do the same thing. I could see this being less ideal, as it relies on knowing the name you used in summarise_. But if you always use the same name, like variable_name, this does seem like a viable alternative for some situations.

a %>%
    filter_(~x == tag) %>%
    group_by_(tag) %>%
    summarise_(variable_name = interp(~sum(var, na.rm = TRUE), 
                                      var = as.name(paste0(metric,"_",run1)))) %>%
    rename_(.dots = setNames("variable_name", paste0(metric,"_",run1)))

Source: local data frame [1 x 2]

  2011 y_zm
1 2011   30
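
A note for anyone on a newer dplyr: from version 0.7 on, the underscore verbs are deprecated in favor of tidy evaluation, where `:=` lets you put a computed name on the left-hand side and the `.data` pronoun looks up a column by string. The sketch below is an assumption-laden rewrite of the pipe above, not a drop-in replacement: `col_name` and `res` are names I made up, the filter value is converted to numeric explicitly, and I group by `x` rather than by the constant in `tag`, so the grouping column is called `x` instead of `2011`.

```r
library(dplyr)

# Same data as in the question.
a <- data.frame(
  x = c(2010, 2010, 2011, 2011, 2011),
  y_zm = rep(10, 5),
  y_r2 = rep(20, 5))

tag <- "2011"
col_name <- paste0("y", "_", "zm")  # hypothetical helper variable

res <- a %>%
  filter(x == as.numeric(tag)) %>%
  group_by(x) %>%                   # assumption: group by the column x
  # `!!col_name :=` sets the output name; .data[[col_name]] reads the input
  summarise(!!col_name := sum(.data[[col_name]], na.rm = TRUE))

res
```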
aosmith
  • I'm not fully understanding the answer yet but it works like a charm. thanks a lot! I will have to dive deep into the dplyr documentation to grasp it. – Triamus Aug 20 '15 at 07:46
  • Is it just me, or is dplyr SE just painful? – MySchizoBuddy Jun 07 '16 at 12:01
  • @MySchizoBuddy: I have the same impression. Things that are easy when you hardcode the name become ridiculously convoluted if you don't... – Thomas Jan 09 '18 at 08:12