14

I know it's good practice not to use names from the global namespace when naming variables, but what happens when you do this accidentally?

I thought I would lose the previous object but R seems to have some trickery under the hood:

print(sd)
#> function (x, na.rm = FALSE) 
#> sqrt(var(if (is.vector(x) || is.factor(x)) x else as.double(x), 
#>     na.rm = na.rm))
#> <bytecode: 0x0000000017e687b8>
#> <environment: namespace:stats>

sd <- 12.2

print(sd)
#> [1] 12.2

sd(1:10)
#> [1] 3.02765

So now R knows there is both a length one double vector called sd and a stats function sd() in the global namespace?

Or when I call sd(1:10) the interpreter automatically expands this to sd.default()? But how does R know to look for a default method on sd as it's now a vector? So functions and variables stored in different places in memory can be referenced by the same name?

obviously_a_user_defined_variable <- 257
obviously_a_user_defined_variable(1:10)
#> Error in obviously_a_user_defined_variable(1:10): could not find function "obviously_a_user_defined_variable"
smci
rdh
  • Simple answer: `sd(1:10)` is a call to the function `sd`. So R looks for a function called `sd()`, which it finds in the pre-loaded stats package. – Rich Scriven Nov 19 '17 at 21:38
  • The term (and tag) you want to use is **'shadowing'**, as in 'a variable of the same name is shadowing the function'. Strictly, the function's name hasn't been 'reassigned'. – smci Nov 20 '17 at 03:36

2 Answers

13

R behaves much as if it had separate namespaces for functions and variables (strictly, both live in the same environments, but the lookup differs). Depending on the context in which a name occurs, R looks the name up in one way or the other.

For instance, the expression sd(1:10) is a call and the first element in a call must be the name of a function. Therefore, in this case, R will look for a function named sd.

On the other hand, the expression sd is not a call but a name, which could be either the name of a variable or the name of a function. In this case R will look for the first object named sd in the search path, regardless of whether it's a function or another type of object.
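The two kinds of lookup described above can be sketched as follows. This is a minimal illustration, assuming a fresh session with the stats package attached (the default); the name `f` is only a placeholder for this example.

```r
# Shadow the stats function with a numeric variable, as in the question.
sd <- 12.2

# Name lookup: the first object called `sd` is returned, whatever its type.
class(sd)
#> [1] "numeric"

# Call lookup: bindings that are not functions are skipped.
sd(1:10)
#> [1] 3.02765

# get() with mode = "function" asks for the same filtered lookup explicitly.
f <- get("sd", mode = "function")
identical(f, stats::sd)
#> [1] TRUE

rm(sd)  # remove the shadowing variable again
```

Removing the variable with `rm(sd)` leaves the function untouched, since the assignment never modified the binding in stats.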

Ernest A
  • Nice reference, I think I prefer the Lisp-1 scenario, where a name points to one thing and one thing only. Less ambiguity that way. But perhaps there are benefits to R's approach – rdh Nov 19 '17 at 21:44
  • R is highly flexible. You can do absolutely anything, even awful tricks, and it always works. That is both its strength and its weakness. I do agree with you as a developer, but to be pragmatic, the fact that R always works whatever bullshit you write is useful and convenient. – JRR Nov 19 '17 at 22:31
  • The last part of this is not quite right; without parentheses, R looks for the first thing in the search path called `sd`, regardless of whether it's a function or another object, e.g. `mtcars <- function() 'foo'; mtcars` – alistaire Nov 19 '17 at 23:38
  • Does that mean that what namespace a name is assigned in depends on a runtime check, where `a <- b` assigns to something different depending on what the `b` lookup finds? – user2357112 Nov 20 '17 at 06:06
  • This isn't exactly correct I think, e.g. running `a <- mean; a <- 5; a(1:10)` will give an error, because the function `a` has been overwritten with a variable. Yes, when using the parentheses `a()` R will look for a function called `a` on the search path and avoid other objects, but I don't think there are separate namespaces for functions and variables. – Axeman Nov 20 '17 at 08:07
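The case raised in the last comment can be reproduced with a short sketch. Here `a` and `res` are illustrative names, and `tryCatch()` is used only so that the error is captured rather than stopping the script.

```r
a <- mean   # `a` is bound to a function in the current environment
a(1:10)
#> [1] 5.5

a <- 5      # the same binding now holds a number; the function is gone

# No other function named `a` exists further along the search path,
# so the call fails instead of falling back to anything.
res <- tryCatch(a(1:10), error = function(e) e)
inherits(res, "error")
#> [1] TRUE
conditionMessage(res)
#> [1] "could not find function \"a\""
```

The difference from the `sd` case is that there the variable and the function sit in two different environments, whereas here a single binding is overwritten.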
3

sd belongs to the stats environment, not to globalenv. When you call sd(), R looks for a function named sd. There is none in globalenv, so R searches the other environments on the search path until it finds a function named sd.

This is called lexical scoping, and it is explained in Hadley Wickham's book Advanced R: http://adv-r.had.co.nz/. It is likely covered in this chapter, http://adv-r.had.co.nz/Environments.html, or this one, http://adv-r.had.co.nz/Functions.html.
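To see where each `sd` lives, something like the following can be run. It is a rough sketch that assumes the default set of attached packages; `in_global`, `fun_home` and `places` are names made up for this example.

```r
# Create the shadowing variable in the global environment
# (equivalent to `sd <- 12.2` typed at the top level).
assign("sd", 12.2, envir = globalenv())

# The variable is in globalenv itself, not inherited from elsewhere.
in_global <- exists("sd", envir = globalenv(), inherits = FALSE)
in_global
#> [1] TRUE

# The function is defined in the stats namespace.
fun_home <- environmentName(environment(stats::sd))
fun_home
#> [1] "stats"

# find() lists every place on the search path holding an object named sd,
# in the order R would consult them.
places <- find("sd")
places
#> [1] ".GlobalEnv"    "package:stats"

rm("sd", envir = globalenv())  # clean up the shadowing variable
```

The order of `places` mirrors `search()`: the global environment comes first, which is why a plain `sd` returns the number while `sd()` continues on to the stats package.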

JRR
  • The `search()` path explains where it finds them (and why it finds `sd` before `sd()`), but not why it keeps looking for `sd()`, which is presumably because the parentheses mean it gets interpreted as a function call instead of an object. – alistaire Nov 19 '17 at 23:34