29

I am currently running R version 3.1.0 (on Ubuntu 12.04 LTS), and as both my R version and my operating system are getting rather old, I plan on updating both. However, I have a lot of simulations that rely on set.seed(), and I would like them to still give me the same random numbers after updating both R and my operating system.

So my question is three-fold.

  1. Can I update R without changing which numbers are generated from each seed?
  2. Can I do the same for my operating system?
  3. If no to either 1) or 2), is there a way to change the seeds in my code so that they are consistent with the old seeds?
Phil
  • Random number generation is done using an algorithm. `set.seed()` passes the seed to it. Hence, it shouldn't depend on OS and R version. So, 1. Yes. 2. Yes. – kangaroo_cliff Nov 09 '17 at 10:52

3 Answers

66

Cross-OS consistency: yes

If you install R on two different operating systems without manually changing the defaults or the .Rprofile, you should get the same results when using set.seed().

Consistency over versions of R: not necessarily

It used to be the case that set.seed() would give the same results across R versions, but that's no longer generally true thanks to a little-announced update in R 3.6.0. So you can get cross version consistency comparing results before R 3.6.0, but if you compare a post-3.6.0 use of set.seed() to a pre-3.6.0 use of set.seed(), you will get different results.

You can see that in the examples below:

R 3.2.0

> set.seed(1999)
> sample(LETTERS, 3)
[1] "T" "N" "L"

R 3.5.3

> set.seed(1999)
> sample(LETTERS, 3)
[1] "T" "N" "L"

R 3.6.0

> set.seed(1999)
> sample(LETTERS, 3)
[1] "D" "Z" "R"

The reason for the inconsistency is that in R 3.6.0 the default sampling method used by sample() was changed from "Rounding" to "Rejection"; the underlying random-number generator itself did not change. To make results that depend on sample() match those from earlier versions, you first have to call RNGkind(sample.kind = "Rounding").

R 3.6.0

> RNGkind(sample.kind = "Rounding")
Warning message:
In RNGkind(sample.kind = "Rounding") : non-uniform 'Rounding' sampler used
> set.seed(1999)
> sample(LETTERS, 3)
[1] "T" "N" "L"
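
Because RNGkind() changes a session-wide setting, it can be worth recording the previous setting and restoring it afterwards. The following is only a minimal sketch, assuming R 3.6.0 or later (where RNGkind() reports three values); the variable names old and legacy_draw are made up for illustration.

```r
# Assumes R >= 3.6.0, where RNGkind() returns three settings:
# the generator, the normal generator, and the sample kind.
old <- RNGkind()
print(old)

# Switch to the legacy sampler; R warns that it is non-uniform,
# so the warning is suppressed here.
suppressWarnings(RNGkind(sample.kind = "Rounding"))
set.seed(1999)
legacy_draw <- sample(LETTERS, 3)
print(legacy_draw)

# Restore whichever sample kind was active before.
RNGkind(sample.kind = old[3])
print(RNGkind()[3])
```

Restoring the setting keeps the legacy sampler confined to the code that needs to reproduce old results.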
bschneidr
3

Having tested on several R versions (3.1.0, 3.3.1, 3.4.2) and two different machines (Windows 7 x64, Windows 10 x64), I got the same runif() random numbers with the same set.seed() independently of the R versions and the operating system. As far as I know, this suggests a yes for both questions 1 and 2.
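
One way to check this on a newer installation is to compare runif() and rnorm() draws under both sample.kind settings. This is only a sketch, assuming R 3.6.0 or later (earlier versions have no sample.kind argument); the seed 42 and the variable names are arbitrary. Since sample.kind governs only how sample() picks integers, both comparisons are expected to come out identical.

```r
# Assumes R >= 3.6.0; set.seed() has no sample.kind argument before that.
set.seed(42, sample.kind = "Rejection")
u_new <- runif(5)
n_new <- rnorm(5)

# The legacy sampler triggers a warning, which is suppressed here.
suppressWarnings(set.seed(42, sample.kind = "Rounding"))
u_old <- runif(5)
n_old <- rnorm(5)

# Compare the continuous draws under the two settings.
print(identical(u_new, u_old))
print(identical(n_new, n_old))

# Put the session back on the current default sampler.
set.seed(42, sample.kind = "Rejection")
```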

cdermont
  • Thank you for your reply. I realised that this could be tested rather easily. So I ran set.seed(75842); rnorm(3) on two computers, using different operating systems and different versions of R. In both cases I got [1] 1.5704983 -0.9103801 0.6197490. So it appears that it will be safe to upgrade from that point of view. – Phil Nov 09 '17 at 12:07
2

As stated in the accepted answer, the default algorithm for sampling changed in version 3.6.0. From the R documentation:

sample.kind can be "Rounding" or "Rejection", or partial matches to these. The former was the default in versions prior to 3.6.0: it made sample noticeably non-uniform on large populations, and should only be used for reproduction of old results. See PR#17494 for a discussion.

In R versions 3.6.0 and later, you can keep consistency with older results by selecting the older algorithm in a one-liner:

set.seed(<seed_number_here>, sample.kind = "Rounding")

Note that the sample.kind option was added in 3.6.0, so you cannot use the newer, better method on older R versions (that is, set.seed(<seed_number_here>, sample.kind = "Rejection") does not work in R versions prior to 3.6.0).
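
If the same script has to run on both old and new installations, one option is to guard the call on the running R version. The sketch below is one possible approach, not an established recipe; set_seed_compat is a hypothetical helper name, not part of base R.

```r
# set_seed_compat() is a hypothetical helper, not a base R function.
set_seed_compat <- function(seed) {
  if (getRversion() >= "3.6.0") {
    # Newer R: ask for the legacy sampler explicitly. R warns that it is
    # non-uniform, so the warning is suppressed here.
    suppressWarnings(set.seed(seed, sample.kind = "Rounding"))
  } else {
    # Older R: no sample.kind argument; the legacy sampler is the only one.
    set.seed(seed)
  }
  invisible(NULL)
}

set_seed_compat(1999)
draw <- sample(LETTERS, 3)
print(draw)
```

Keep in mind that on R 3.6.0 and later this leaves the legacy sampler active for the rest of the session, so it is best reserved for reproducing old results.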

luchonacho