0

I would like to write a function that takes a data frame and one of its factor variables as input, and returns a data frame with the levels of that factor and the number of occurrences of each level.

Here is code that does that:

library(dplyr)
df <- data.frame(ID = sample(c("a", "b", "c", "d"), 20, replace = TRUE))
df %>% group_by(ID) %>% summarise(no_rows = length(ID)) %>% arrange(desc(no_rows))

But I don't know how to put that in a function, since the variable name (ID) is not quoted in the pipeline above.

f <- function(df, var){
   df %>% group_by(var) %>% summarise(no_rows = length(var)) %>% arrange(desc(no_rows))
}

f(df, ID) does not work, and I can't write f(df, "ID") either.

armandfavrot

2 Answers

2

Use dplyr::count() with non-standard evaluation (see this SO post or the tidyverse documentation), combined with the argument sort = TRUE:

library(dplyr)
f <- function(df, var) df %>% count({{ var }}, name = "no_rows", sort = TRUE)

set.seed(1) # using seed for reproducibility
df <- data.frame(ID = sample(c("a", "b", "c", "d"), 20, replace = TRUE))

f(df, ID)
  ID no_rows
1  b       7
2  a       6
3  c       6
4  d       1
Donald Seinen
  • Good to point out that `{{ var }}` does in one fell swoop what `var <- rlang::enquo(var)` and `!!var` do in two steps. – Greg Dec 07 '21 at 16:48
  • I have another related question: how can I use var as if it were the string "var"? For example, which(colnames(df) == var) doesn't work; I tried which(colnames(df) == !!var), but that doesn't work either. It needs to be evaluated as which(colnames(df) == "var"). How can I manage this? – armandfavrot Dec 08 '21 at 13:47
  • @armandfavrot in what context are you trying this? within a function? within a `tidyverse` pipeline? `!!` and `{{` should only be used within `tidyverse` verbs, which `colnames` is not. Have a look at `deparse(substitute(var))` and [this book chapter](http://adv-r.had.co.nz/Computing-on-the-language.html). If further problems come up, consider opening a separate question, as this is only loosely related. – Donald Seinen Dec 08 '21 at 15:07
  • @DonaldSeinen I was trying this within a function, the same function f (I wanted to add some lines referring to "var" instead of var), and deparse(substitute(var)) was what I was looking for. Thanks again! – armandfavrot Dec 08 '21 at 15:28
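A minimal sketch of the `deparse(substitute(var))` suggestion from the comments above, assuming the goal is to find the position of a column that is passed unquoted. The helper name `col_position` and the toy data frame are hypothetical and do not come from the thread.

```r
# Hypothetical helper: returns the index of the column passed as a bare name.
col_position <- function(df, var) {
  # substitute() captures the expression typed by the caller (e.g. ID);
  # deparse() turns that expression into the string "ID".
  var_name <- deparse(substitute(var))
  which(colnames(df) == var_name)
}

df <- data.frame(ID = c("a", "b", "a"), value = 1:3)
col_position(df, ID)     # 1
col_position(df, value)  # 2
```

This only behaves as intended when `col_position()` is called directly with a bare column name. If the argument is forwarded from another function, `substitute()` sees the forwarding function's argument name instead of the column name.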
1
library(dplyr)
f <- function(df, var){
   var <- enquo(var)  # capture the unquoted column name
   df %>% group_by(!!var) %>% summarise(no_rows = length(!!var)) %>% arrange(desc(no_rows))
}

Update the function as above, then call it:

f(df, ID)

Output:

  ID    no_rows
  <chr>   <int>
1 a           6
2 d           6
3 b           4
4 c           4
Samet Sökel
  • I have another related question: how can I use var as if it were the string "var"? For example, which(colnames(df) == var) doesn't work; I tried which(colnames(df) == !!var), but that doesn't work either. It needs to be evaluated as which(colnames(df) == "var"). How can I manage this? – armandfavrot Dec 08 '21 at 13:52
  • I think it's because the `colnames()` function returns a character vector. – Samet Sökel Dec 09 '21 at 07:47
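As a rough check of the comment that `{{ var }}` does in one step what `enquo()` plus `!!` do in two, the sketch below runs both answers' approaches on the same seeded data and compares the columns. The names `f_embrace` and `f_enquo` are hypothetical labels for the two versions. The outputs printed in the two answers differ, presumably because only the first one sets a seed.

```r
library(dplyr)

# First answer's approach: the embrace operator inside count().
f_embrace <- function(df, var) {
  df %>% count({{ var }}, name = "no_rows", sort = TRUE)
}

# Second answer's approach: enquo() to capture, !! to unquote.
f_enquo <- function(df, var) {
  var <- enquo(var)
  df %>%
    group_by(!!var) %>%
    summarise(no_rows = length(!!var)) %>%
    arrange(desc(no_rows))
}

set.seed(1)  # same seed as the first answer, so both functions see identical data
df <- data.frame(ID = sample(c("a", "b", "c", "d"), 20, replace = TRUE))

res_embrace <- f_embrace(df, ID)
res_enquo   <- f_enquo(df, ID)

# Compare column by column; the containers differ (data.frame vs tibble).
identical(res_embrace$ID, res_enquo$ID)
identical(res_embrace$no_rows, res_enquo$no_rows)
```

Comparing column by column avoids spurious differences from the result classes, since `count()` on a plain data frame returns a data frame while `summarise()` returns a tibble.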