Function takes an argument literally rather than the value of the argument

Question

I tried to write a function that returns the x largest MOLECULES, ranked in descending order by how many unique PATIENT_ID values each one has, counting only records from a given date up to the latest one.

data <- data.frame(PATIENT_ID = c(1,1,2,2), dateM = c(ymd("2020-01-05","2020-01-06","2020-05-06","2019-12-15")), MOLECULES = c("mol1", "mol1", "mol1", "mol2"))


topx <- function(data, datefrom, var ,  x = 5){
  data %>%
  subset(dateM >= datefrom) %>%
  group_by(var) %>%
  summarize(pat = length(unique(PATIENT_ID))) %>%
  arrange(-pat) %>% 
  head(x) %>% 
  select(1)
}

topx(data = data, datefrom = "2016-04-01", var = MOLECULES, x = 2)

The wanted result in this case would be:

c("mol1","mol2")

However, it treats var as literal text instead of substituting MOLECULES, and fails with:

 Error: Must group by variables found in `.data`.
* Column `var` is not found.

Heads up, there’s also the function [`slice_max`](https://dplyr.tidyverse.org/reference/slice.html) in ‘dplyr’, which does something very similar; that said, I don’t think using it here would help. Apart from this, I recommend not mixing ‘dplyr’ functions with the base R equivalents. That is, use `filter` instead of `subset`. `filter` is more robust, provides better error messages when you do something wrong, and also works with interpolated variables via `{{…}}`. `subset` would *not* work with it. In principle the same is true with `head` vs `slice_head`, but the argument is less strong here. — Konrad Rudolph, Jan 14 '21 at 13:23
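A minimal sketch of what the comment above suggests, assuming the toy data from the question. The name `topx_filter` is made up for illustration, and `as.Date()` stands in for `ymd()` so that only dplyr is needed. It swaps `subset` for `filter` and `head` for `slice_head`, and uses `{{ var }}` to forward the column.

```r
library(dplyr)

# Toy data from the question; as.Date() replaces lubridate::ymd() here.
data <- data.frame(
  PATIENT_ID = c(1, 1, 2, 2),
  dateM = as.Date(c("2020-01-05", "2020-01-06", "2020-05-06", "2019-12-15")),
  MOLECULES = c("mol1", "mol1", "mol1", "mol2"),
  stringsAsFactors = FALSE
)

# topx_filter is a hypothetical name, not taken from the original post.
topx_filter <- function(data, datefrom, var, x = 5) {
  data %>%
    filter(dateM >= as.Date(datefrom)) %>%      # dplyr's filter() instead of subset()
    group_by({{ var }}) %>%                     # forward the column passed as `var`
    summarise(pat = n_distinct(PATIENT_ID)) %>% # unique patients per group
    arrange(desc(pat)) %>%
    slice_head(n = x) %>%                       # instead of head(x)
    select({{ var }})
}

top2 <- topx_filter(data, datefrom = "2016-04-01", var = MOLECULES, x = 2)
```

Here `top2` is a one-column tibble with the molecules ordered by patient count.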

score 2 · Accepted Answer · answered Jan 14 '21 at 13:09

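Cool function. Programming with dplyr follows special rules, because its verbs quote their arguments (see the "Programming with dplyr" vignette). Inside group_by(), a bare var is looked up as a column literally named var. To forward the column the caller supplied, you need the {{ }} operator.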


library(tidyverse)
library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union

data <- data.frame(PATIENT_ID = c(1,1,2,2), dateM = c(ymd("2020-01-05","2020-01-06","2020-05-06","2019-12-15")), MOLECULES = c("mol1", "mol1", "mol1", "mol2"))

topx <- function(data, datefrom, var ,  x = 5){
  data %>%
    subset(dateM >= datefrom) %>%
    group_by({{var}}) %>%
    summarize(pat = length(unique(PATIENT_ID))) %>%
    arrange(-pat) %>% 
    head(x) %>% 
    select(1)
}

topx(data = data, datefrom = "2016-04-01", var = MOLECULES, x = 2) 
#> `summarise()` ungrouping output (override with `.groups` argument)
#> # A tibble: 2 x 1
#>   MOLECULES
#>   <chr>    
#> 1 mol1     
#> 2 mol2

Created on 2021-01-14 by the reprex package (v0.3.0)
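The question asked for a plain character vector such as c("mol1","mol2"), while `select(1)` returns a one-column tibble. One possible variation, sketched under the assumption that the toy data from the question is used, ends the pipeline with `pull()` instead. The name `topx_vec` is hypothetical and `as.Date()` stands in for `ymd()`.

```r
library(dplyr)

# Toy data from the question; as.Date() replaces lubridate::ymd() here.
data <- data.frame(
  PATIENT_ID = c(1, 1, 2, 2),
  dateM = as.Date(c("2020-01-05", "2020-01-06", "2020-05-06", "2019-12-15")),
  MOLECULES = c("mol1", "mol1", "mol1", "mol2"),
  stringsAsFactors = FALSE
)

# topx_vec is a hypothetical name, not taken from the original post.
topx_vec <- function(data, datefrom, var, x = 5) {
  data %>%
    filter(dateM >= as.Date(datefrom)) %>%
    group_by({{ var }}) %>%
    summarise(pat = n_distinct(PATIENT_ID), .groups = "drop") %>%
    arrange(desc(pat)) %>%
    head(x) %>%
    pull({{ var }})   # extract the column as a vector instead of a tibble
}

top_vec <- topx_vec(data, datefrom = "2016-04-01", var = MOLECULES, x = 2)
```

With this data `top_vec` is a character vector of length 2, which can be used directly in something like `MOLECULES %in% top_vec`.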

Follow-up question: when I use this function as intended, i.e. inside a summary that checks whether values of MOLECULES are in the top x, it prints `summarise()` regrouping output by 'dateM' (override with `.groups` argument), followed by `summarise()` ungrouping output (override with `.groups` argument), with the second message repeated about 48 times. This persists even when I add `as.factor` or `as.character` to the end of the function – Jirka Čep, Jan 14 '21 at 15:52
See answer here: https://stackoverflow.com/questions/62140483/how-to-interpret-dplyr-message-summarise-regrouping-output-by-x-override — Magnus Nordmo, Jan 14 '21 at 20:54
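Those messages are informational rather than errors. As a rough sketch, assuming the toy data from the question and a summary grouped by two columns, passing `.groups` explicitly tells `summarise()` what grouping to return, so it has nothing to report:

```r
library(dplyr)

# Toy data from the question; as.Date() replaces lubridate::ymd() here.
data <- data.frame(
  PATIENT_ID = c(1, 1, 2, 2),
  dateM = as.Date(c("2020-01-05", "2020-01-06", "2020-05-06", "2019-12-15")),
  MOLECULES = c("mol1", "mol1", "mol1", "mol2"),
  stringsAsFactors = FALSE
)

# With two grouping variables, summarise() by default drops only the last one
# and prints a message about the grouping that remains.
# .groups = "drop" returns an ungrouped tibble instead.
per_day <- data %>%
  group_by(dateM, MOLECULES) %>%
  summarise(pat = n_distinct(PATIENT_ID), .groups = "drop")

grouped <- is_grouped_df(per_day)
```

Adding `.groups = "drop"` to the `summarise()` call inside `topx` should silence the repeated message in the same way. Setting `options(dplyr.summarise.inform = FALSE)` is another option if you prefer to leave the function untouched.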

score 0 · Answer 2 · latlio · 2021-01-14T13:27:03.987

I believe this is a quasiquotation issue. `!!` unquotes a single captured expression, so `var` first has to be captured with `enquo()` and can then be injected into `group_by()` with `!!var`. For more information see https://adv-r.hadley.nz/quasiquotation.html

Try:

topx <- function(data, datefrom, var ,  x = 5){
  var <- enquo(var)
  data %>%
  subset(dateM >= datefrom) %>%
  group_by(!!var) %>%
  summarize(pat = length(unique(PATIENT_ID))) %>%
  arrange(-pat) %>% 
  head(x) %>% 
  select(1)
}

edited Jan 14 '21 at 13:27

answered Jan 14 '21 at 12:57

latlio


That won’t work, you also need to `enquo` the variable — and `{{var}}` does both enquoting and unquoting/expanding for you. – Konrad Rudolph Jan 14 '21 at 13:15
Indeed this does not work; however, it is a useful bit of information that I had no idea was a thing – Jirka Čep Jan 14 '21 at 13:17
oh sorry forgot to add the `enquo`! – latlio Jan 14 '21 at 13:26
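To illustrate the point made in these comments, here is a small sketch comparing the two spellings side by side, assuming the toy data from the question. The helper names `count_patients_bang` and `count_patients_curly` are invented for this example.

```r
library(dplyr)

# Toy data from the question; as.Date() replaces lubridate::ymd() here.
data <- data.frame(
  PATIENT_ID = c(1, 1, 2, 2),
  dateM = as.Date(c("2020-01-05", "2020-01-06", "2020-05-06", "2019-12-15")),
  MOLECULES = c("mol1", "mol1", "mol1", "mol2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: two explicit steps, enquo() then !!
count_patients_bang <- function(data, var) {
  var <- enquo(var)             # capture the expression the caller supplied
  data %>%
    group_by(!!var) %>%         # unquote it inside group_by()
    summarise(pat = n_distinct(PATIENT_ID), .groups = "drop")
}

# Hypothetical helper: {{ }} captures and unquotes in one step
count_patients_curly <- function(data, var) {
  data %>%
    group_by({{ var }}) %>%
    summarise(pat = n_distinct(PATIENT_ID), .groups = "drop")
}

res_bang  <- count_patients_bang(data, MOLECULES)
res_curly <- count_patients_curly(data, MOLECULES)
same <- isTRUE(all.equal(as.data.frame(res_bang), as.data.frame(res_curly)))
```

Both helpers group by MOLECULES and give the same counts, so the choice is mostly one of readability.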
