
During a recent investigation into setting random seeds within functions, I came across an odd situation. Consider functions f and g, each of which sets the random seed and then performs a simple randomized operation:

g <- function(size) { set.seed(1) ; runif(size) }
f <- function(x) { set.seed(2) ; x*runif(length(x)) }

Because each function sets the random seed, I would expect each function to always have the same return value given the same input. This would mean f(g(2)) should return the same thing as x <- g(2) ; f(x). To my surprise this is not the case:

f(g(2))
# [1] 0.1520975 0.3379658

x <- g(2)
f(x)
# [1] 0.04908784 0.26137017

What is going on here?

josliber

2 Answers


This is an example of the double-slit R experiment. When x is observed, it acts as a particle; when unobserved, it acts as a wave. Behold:

g <- function(size) { set.seed(1) ; runif(size) }
f <- function(x) {set.seed(2) ; x*runif(length(x)) }
f2 <- function(x) {print(x); set.seed(2) ; x*runif(length(x)) }

f(g(2))
# [1] 0.1520975 0.3379658

x <- g(2)
f(x)
# [1] 0.04908784 0.26137017


f2(g(2))
# [1] 0.2655087 0.3721239
# [1] 0.04908784 0.26137017

x <- g(2)
f2(x)
# [1] 0.2655087 0.3721239
# [1] 0.04908784 0.26137017

I'm just josliber-ing you. The call to print is forcing x, so g(2) is evaluated before set.seed(2) runs. You can do that explicitly with force:

f <- function(x) {force(x); set.seed(2) ; x*runif(length(x)) }
x <- g(2)
f(x)
# [1] 0.04908784 0.26137017

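But not like this. With the original f (the one without force(x) in its body), wrapping the argument at the call site does not help, because force(g(2)) is itself passed unevaluated and only runs once x is first used, after set.seed(2):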

f(force(g(2)))
# [1] 0.1520975 0.3379658
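
To check that forcing inside the function really makes the two call styles agree, here is a minimal sketch. It assumes the same g as above; f_forced is a hypothetical name used only so that the original f stays untouched.

```r
g <- function(size) { set.seed(1); runif(size) }

# Hypothetical variant of f: evaluate the argument before reseeding.
f_forced <- function(x) {
  force(x)              # runs g(2) now, so its set.seed(1) happens first
  set.seed(2)           # reseed only after x has a value
  x * runif(length(x))
}

nested <- f_forced(g(2))

x <- g(2)
two_step <- f_forced(x)

identical(nested, two_step)
# [1] TRUE
```

In both cases the seed is set to 1, x is drawn, the seed is set to 2, and only then is the multiplier drawn, so the results match.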
rawr

R evaluates function arguments lazily: the x argument of your f() function is only evaluated at the moment it is first used inside the function. This means that when you compute f(g(2)), set.seed(2) runs before g() is executed, and g() then resets the seed to 1.

> f(g(2))
[1] 0.1520975 0.3379658

is basically equivalent to:

> set.seed(2)
> set.seed(1)
> result <- runif(2)
> result*runif(length(result))
[1] 0.1520975 0.3379658
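
The ordering described above can be made visible with a small tracing sketch. The names trace_log, note, g_traced and f_traced are hypothetical helpers introduced only for this illustration; they mirror g and f but record when each seed is set.

```r
# Hypothetical helpers: record the order in which the seeds are set.
trace_log <- character(0)
note <- function(msg) trace_log <<- c(trace_log, msg)

g_traced <- function(size) {
  note("g: set.seed(1)")
  set.seed(1)
  runif(size)
}

f_traced <- function(x) {
  note("f: set.seed(2)")
  set.seed(2)
  x * runif(length(x))   # x is first used here, which triggers g_traced(2)
}

lazy <- f_traced(g_traced(2))
print(trace_log)
# [1] "f: set.seed(2)" "g: set.seed(1)"

# The same sequence written out by hand gives the same numbers.
set.seed(2)
set.seed(1)
result <- runif(2)
manual <- result * runif(length(result))
identical(lazy, manual)
# [1] TRUE
```

The log shows the body of f_traced starting before g_traced is ever called, which is why the seed in effect for the multiplier is 1 rather than 2.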
Jellen Vermeir