While using n() within summarise_at(), I obtain this error:

Error: n() should only be called in a data context
Call `rlang::last_error()` to see a backtrace

Others have suggested this could be caused by plyr masking dplyr functions. Two proposed solutions are:

  1. Replace `summarise_at()` with `dplyr::summarise_at()`
  2. Call `detach("package:plyr", unload = TRUE)`

Neither has removed the error, and I'm curious to understand what is causing it. Here is a reproducible example that should produce the same error:

library(dplyr)

Df <- data.frame(
  Condition = c(rep("No", 20), rep("Yes",20)),
  Height = c(rep(1,10),rep(2,10),rep(1,10),rep(2,10)),
  Weight = c(rep(10,5),rep(20,5),rep(30,5), rep(40,5))
)

x <- c("Height","Weight")

Df %>% 
  group_by(Condition) %>% 
  summarise_at(vars(one_of(x)), c(mean = mean, sd = sd, count = n()))

Note: if you remove `count = n()`, the code runs without any issue.

Ali

1 Answer

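I believe it is because `n()` takes no arguments and reports the size of the current group, so it only works when evaluated inside a verb such as `mutate()`, `filter()`, or `summarise()`. Writing `count = n()` calls `n()` immediately, while the list of functions is being built and before any data is available, which triggers the error. `summarise_at()` expects functions that it can apply to each selected column, so just use `length` instead, which gives the same count per column.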

Df %>% 
  group_by(Condition) %>% 
  summarise_at(vars(one_of(x)), c(mean = mean, sd = sd, count = length))
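To make the difference concrete, here is a minimal sketch (assuming only that dplyr is installed; the names `msg`, `fns`, and `res` are introduced here for illustration). It shows that `n()` fails as soon as it is called outside a dplyr verb, whereas `length` is passed along as a function and only called later, once per column and per group. The exact wording of the error message depends on the dplyr version.

```r
library(dplyr)

Df <- data.frame(
  Condition = c(rep("No", 20), rep("Yes", 20)),
  Height = c(rep(1, 10), rep(2, 10), rep(1, 10), rep(2, 10)),
  Weight = c(rep(10, 5), rep(20, 5), rep(30, 5), rep(40, 5))
)
x <- c("Height", "Weight")

# Calling n() at the top level fails: there is no current group to count.
# Capture the error message instead of stopping the script.
msg <- tryCatch(n(), error = function(e) conditionMessage(e))
print(msg)

# A list of function objects; nothing is called while the list is built.
fns <- list(mean = mean, sd = sd, count = length)

res <- Df %>%
  group_by(Condition) %>%
  summarise_at(vars(one_of(x)), fns)

print(res)
```

Both `Height_count` and `Weight_count` come out as 20 for each condition, because every group has 20 rows.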

If you want to only have one count column, then:

Df %>% 
  group_by(Condition) %>%
  mutate(count = n()) %>%
  group_by(Condition, count) %>%
  summarise_at(vars(one_of(x)), c(mean = mean, sd = sd))
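As a possible alternative, and not part of the answer as originally written: assuming dplyr 1.0.0 or later, which added `across()`, a single count column can be produced inside one `summarise()` call, because `n()` is then evaluated in a data context. The name `res` is introduced here for illustration.

```r
library(dplyr)

Df <- data.frame(
  Condition = c(rep("No", 20), rep("Yes", 20)),
  Height = c(rep(1, 10), rep(2, 10), rep(1, 10), rep(2, 10)),
  Weight = c(rep(10, 5), rep(20, 5), rep(30, 5), rep(40, 5))
)
x <- c("Height", "Weight")

res <- Df %>%
  group_by(Condition) %>%
  summarise(
    count = n(),                                    # one count per group
    across(all_of(x), list(mean = mean, sd = sd))   # per-column summaries
  )

print(res)
```

This avoids the extra `mutate()` and second `group_by()`, at the cost of requiring a newer dplyr than the one the question was asked against.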
caldwellst
  • Many thanks for the explanation. As a side question, this results in two columns, 'Height_count' and 'Weight_count'. Is there any way to obtain just one column called 'Count'? – Ali Sep 27 '19 at 11:19
  • A bit hacky, but I don't think there is a way to do it in `summarize`, so we can add it in a `mutate` call beforehand and make it a grouping variable to keep it in the result. I will edit my answer above to add this. – caldwellst Sep 27 '19 at 11:53