
I am curious about how the function `n()` from dplyr is implemented. When I print `n` from the dplyr namespace, all I get is this:

function () 
{
    stop("This function should not be called directly")
}
<environment: namespace:dplyr>

Maybe it's a silly question, but where is it actually defined? How come it works when called inside some other functions? In which environment is it hidden?

Thanks for your help

Jaap
Gere Caste
    Good question. [The actual n() function](https://github.com/hadley/dplyr/blob/6153e136fa9397e88478fa6270d9d1f02eb5153e/R/manip.r#L337-L339) is easy to find, but it does not explain why `n()` works in `mutate`, `filter` and `summarise`. Could it be in the C code? – Fr. Dec 30 '16 at 13:43
    Maybe [this](http://stackoverflow.com/questions/39305474/dplyrn-returns-error-this-function-should-not-be-called-directly) helps – akrun Dec 30 '16 at 13:44

1 Answer


As far as I understand, dplyr uses hybrid evaluation: it evaluates some parts of an expression in C++ and others in R. `n()` is one of the functions that is always handled in C++. That is why the R function does nothing except raise an error: when used inside a dplyr verb, it is never actually evaluated by R.

The relevant C++ code can be found on GitHub.
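To make the idea concrete, here is a minimal base-R sketch of the same pattern. It is not dplyr's actual implementation (which does this in C++); `toy_summarise` is a hypothetical helper invented for illustration. It captures the expression unevaluated, handles `n()` itself when it recognises it, and only falls back to ordinary R evaluation otherwise, so the R-level `n()` is never called.

```r
# Placeholder mirroring the stub shown in the question: errors if R ever calls it.
n <- function() stop("This function should not be called directly")

# Hypothetical toy verb (an assumption for illustration, not dplyr code).
toy_summarise <- function(data, expr) {
  expr <- substitute(expr)  # capture the expression without evaluating it

  # "Hybrid" path: the expression is recognised and handled internally,
  # so the R function n() above is never invoked.
  if (identical(expr, quote(n()))) {
    return(nrow(data))
  }

  # Fallback path: ordinary R evaluation with the columns in scope.
  eval(expr, data, parent.frame())
}

df <- data.frame(x = c(1, 2, 3))

toy_summarise(df, n())      # 3: handled by the toy verb itself
toy_summarise(df, mean(x))  # 2: evaluated by R as usual
```

Calling `n()` on its own still raises the error, because nothing intercepts the call before R evaluates it.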

Axeman
    Nice! Interesting function... Actually, it's instructive that Hadley (or whoever) wrote this function with the error in R; otherwise I would not have learned that! – Gere Caste Dec 30 '16 at 14:22