
I have the latest version of R (3.6.1), but functions that use the random number generator default to the older (pre-3.6.0) sampling method, which uses Rounding instead of Rejection. I am not sure why this is happening, and would appreciate your help resolving it.

set.seed(1)
sample(20)
RNGkind()
R.version

Below are the results of my run:

set.seed(1)
sample(20)
# 6  8 11 16  4 14 15  9 19  1  3  2 20 10  5  7 12 17 18 13

RNGkind()
# "Mersenne-Twister" "Inversion"        "Rounding"

R.version

platform       x86_64-w64-mingw32                         
arch           x86_64                                     
os             mingw32                                    
system         x86_64, mingw32                            
status         Patched                                    
major          3                                          
minor          6.1                                        
year           2019                                       
month          09                                         
day            06                                         
svn rev        77160                                      
language       R                                          
version.string R version 3.6.1 Patched (2019-09-06 r77160)
nickname       Action of the Toes        

Based on the NEWS and the linked discussion, I am expecting the output of RNGkind() to look as follows instead:

# "Mersenne-Twister" "Inversion"        "Rejection"

Am I misunderstanding the NEWS?

Konrad Rudolph
Dave
  • Lots of us have image hosting sites blocked at work. Please post a reproducible question. https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Bill O'Brien Sep 10 '19 at 16:20
  • Please run the code that I posted. If you are running it on R 3.6.0 or later, the chances are your results will be different from mine, since your version is likely to default to Rejection as a RNG. While my R version (which is the latest) defaults to an earlier method (Rounding). That's the whole question. An image is posted only as a proof that my run is producing results different from what should be expected under the latest version. – Dave Sep 10 '19 at 16:25
  • Edited, thanks for the feedback. – Dave Sep 10 '19 at 16:30
  • @DirkEddelbuettel, thank you - I am aware that there is a way to manually set the version (and I can in fact manually set RNGversion to Rejection), but my question is - why is my version of 3.6.1 still defaulting to the old method (Rounding) although a change has been introduced? I would expect that after downloading 3.6.1 I would get the updated RNG by default, but I do not. Sorry if I am not being clear, but I hope that the above details will clarify. – Dave Sep 10 '19 at 16:33
  • @Dave again, read the news file. This has nothing to do with Rounding-vs-Rejection. – Dirk Eddelbuettel Sep 10 '19 at 16:35
  • @DirkEddelbuettel, can you please be a little more specific as to what the answer to my particular question is? Or at least direct to the appropriate source where my question is being answered? I am sorry but it is not awfully helpful to just say "read the news file". And where do I even look for the "news file"? Thanks again in advance. – Dave Sep 10 '19 at 16:40
  • @Dave The NEWS file is here: https://cran.r-project.org/doc/manuals/r-release/NEWS.html – Konrad Rudolph Sep 10 '19 at 16:43
  • @KonradRudolph thanks for sharing. Unfortunately, all I could find from that file was what I had already discovered, namely, that "The default method for generating from a discrete uniform distribution (used in sample(), for instance) has been changed. This addresses the fact, pointed out by Ottoboni and Stark, that the previous method made sample() noticeably non-uniform on large populations. See PR#17494 for a discussion. The previous method can be requested using RNGkind() or RNGversion() if necessary for reproduction of old results.". But that unfortunately does not answer my question ... – Dave Sep 10 '19 at 16:49
  • Also: `db <- news(); news(Version=="3.6.0", db=db)` display _just_ the pertinent 3.6.0 notes. You can search further, see `help(news)`. – Dirk Eddelbuettel Sep 10 '19 at 16:49
  • @DirkEddelbuettel, then thanks for all the time dedicated. – Dave Sep 10 '19 at 16:52
  • The output you expected is what I get on R 3.6.1 running on Ubuntu 16.04, and I believe you are correct in your expectations. It may be a bug introduced in a patch, or perhaps you are on a machine with a `.Rprofile` file changing the default (though if you were, you should get a warning on startup), etc. Do you get the same output if you run `RNGversion("3.6.1"); RNGkind()`? – duckmayr Sep 10 '19 at 19:02
  • @duckmayr thanks for the feedback. If I run with your suggestions, I get the "correct" output, that is - `"Mersenne-Twister" "Inversion" "Rejection"`. I could also set that by `RNGkind(sample.kind = "Rejection")`. But unfortunately I still don't know why the default is the older version of RNG if I have the latest version of R. – Dave Sep 10 '19 at 19:32
  • Do you have a .RData file in your startup folder? The random number state is stored in that file and restored when starting R. Try starting R with `R --vanilla`. – Jan van der Laan Sep 16 '19 at 13:55
  • Having had a similar problem, I'd suggest checking the standard `getwd()` upon starting a new session. As @JanvanderLaan suggested, mine contained an `.RData` and an `.Rhistory` file. Deleting both ensured the expected output when running `RNGkind()`. Note that a new `.Rhistory` file is created upon closing the session, which I deleted as well before opening another session with the expected result. – Oliver Sep 19 '19 at 19:49

2 Answers


As suggested by @JanvanderLaan in the comments, a possible cause is an .RData file being loaded on startup. For example, if one had a previous version of R installed and ever used it, the initial working directory (the one reported by getwd() when a session starts) will contain an .RData file and an .Rhistory file if one ever saved the session. On Windows with RStudio this is usually the Documents folder, which few people go out of their way to clear of old or unusual files.

Following the suggestion in the comment, I went to the directory reported by getwd() in a fresh R session and found an .RData file. I deleted it, closed the existing R sessions without saving, and opened a new R session. This seems to have fixed the problem, as can be seen below. Thus it seems the method for generating random numbers is indeed saved between sessions in the .RData file.

RNGkind()
[1] "Mersenne-Twister" "Inversion"        "Rejection"   

Edit (illustration)

We can illustrate this quite easily in a fresh R session, regardless of which random number generator is set. Assuming one has opened and saved an R session under a version prior to R 3.6.0, the following code illustrates the problem:

# Assuming that the R session has just opened
> RNGkind()
[1] "Mersenne-Twister" "Inversion"        "Rounding"
> RNGversion("3.6.1")
> RNGkind()
[1] "Mersenne-Twister" "Inversion"        "Rejection"
> load(".RData", verbose = TRUE)
Loading objects:
  .Random.seed
> RNGkind()
[1] "Mersenne-Twister" "Inversion"        "Rounding"

As can be seen, loading the file restores .Random.seed. What the output does not show explicitly is that the type of random number generator is restored along with it, because the RNG kind is encoded in .Random.seed itself. Executing

file.remove(".RData")
q("no")

should thus fix the issue for future sessions, assuming the working directory has not been changed in the current session.
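To see why loading .Random.seed also changes the sampler, it can help to look at its first element. The sketch below assumes R 3.6.0 or later and relies on the layout described in ?.Random.seed, where the last two decimal digits code the uniform generator, the hundreds code the normal generator, and the ten-thousands code the discrete sampler. The helper name decode_seed_kind is made up for this illustration.

```r
# Hypothetical helper: split the first element of .Random.seed into the
# three zero-based kind codes described in ?.Random.seed.
decode_seed_kind <- function(code) {
  c(
    rng_kind    = code %% 100L,              # uniform generator
    normal_kind = (code %/% 100L) %% 100L,   # normal generator
    sample_kind = code %/% 10000L            # discrete uniform sampler
  )
}

# Seed once under each sampler and record the first element of the seed.
RNGkind("Mersenne-Twister", "Inversion", "Rejection")
set.seed(1)
rejection_code <- get(".Random.seed", envir = globalenv())[1]

# Switching to "Rounding" warns about the non-uniform sampler.
suppressWarnings(RNGkind(sample.kind = "Rounding"))
set.seed(1)
rounding_code <- get(".Random.seed", envir = globalenv())[1]

# Put the session back on the current default sampler.
RNGkind(sample.kind = "Rejection")

decode_seed_kind(rejection_code)
decode_seed_kind(rounding_code)
```

Only the sample_kind entry should differ between the two results, which is consistent with an old .RData file silently switching the session back to "Rounding".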

Oliver

I had hoped that setting a bounty would extract a definite answer to what caused OP's problem. While that didn't happen, some comments and answers suggested a few reasons. I provide an additional answer here to put them all in one place and provide a little better illustration for how to tell when one thing is happening versus the other.

Suggested causes:

  • Seed being set by a .RData file in the initial working directory
  • RNG type being set by .Rprofile
  • Bug in a recent patch

Seed set by a .RData file

As discussed in Oliver's answer, this could be caused by a .RData file in your initial working directory. I won't go into much more detail (you can consult the linked answer for that), but I did want to show what you would see on startup if that were the case. This is what the start up message in R looks like on my machine:

R version 3.6.1 (2019-07-05) -- "Action of the Toes"
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

>

If a .RData file is being read in on startup, which could cause the problem, you'd see a notification right after the last paragraph of the startup message:

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

[Workspace loaded from ~/.RData]
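If the startup message has already scrolled away, a rough way to check is to look for the file directly. The sketch below is only an illustration: the function name rdata_has_seed is made up, and it assumes you call it on the directory R started in. It loads the file into a scratch environment so the current session's RNG state is not touched.

```r
# Hypothetical helper: does `dir` hold a .RData file that stores .Random.seed?
# Returns NA when there is no .RData file, otherwise TRUE or FALSE.
rdata_has_seed <- function(dir = getwd()) {
  rdata_path <- file.path(dir, ".RData")
  if (!file.exists(rdata_path)) {
    return(NA)
  }
  # Load into a throwaway environment instead of the global one.
  scratch <- new.env()
  saved_names <- load(rdata_path, envir = scratch)
  ".Random.seed" %in% saved_names
}

# Run this in a fresh session, before changing the working directory.
rdata_has_seed()
```

A result of TRUE would suggest the saved workspace is restoring an old RNG state on startup.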

RNG type set by .Rprofile

.Rprofile is a script that runs on startup, which you can use to configure settings at the outset of your session. (You can read more about it in the R documentation, for example `?Startup`.) Though I doubt this is the case for you, it is at least possible the problem was caused by a .Rprofile file being run with a line something like the following

RNGkind(sample.kind = "Rounding")

If you had such a setting in a .Rprofile file that was causing your problem, you'd see a warning at the end of your startup message:

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

Warning message:
In RNGkind(sample.kind = "Rounding") : non-uniform 'Rounding' sampler used
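To check for such a line without reading every profile by hand, something like the following could be used. It is a sketch, not an exhaustive search: the function name find_rng_settings is made up, and the candidate locations are the usual ones listed in ?Startup (the R_PROFILE_USER variable, the current directory, the home directory, and the site profile), which may not cover every setup.

```r
# Hypothetical helper: return the lines in each existing file that mention
# RNGkind or RNGversion, as a list named by file path.
find_rng_settings <- function(paths) {
  paths <- paths[nzchar(paths) & file.exists(paths)]
  hits <- lapply(paths, function(p) {
    lines <- readLines(p, warn = FALSE)
    lines[grepl("RNGkind|RNGversion", lines)]
  })
  names(hits) <- paths
  hits[lengths(hits) > 0]
}

# Usual startup profile locations (an assumption; see ?Startup).
candidates <- unique(c(
  Sys.getenv("R_PROFILE_USER"),
  file.path(getwd(), ".Rprofile"),
  file.path(Sys.getenv("HOME"), ".Rprofile"),
  file.path(R.home("etc"), "Rprofile.site")
))

find_rng_settings(candidates)
```

An empty list means none of the files that were found mention either function.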

Bug

If you see neither of those messages at startup, my best guess is that this is caused by a bug introduced in a recent patch to R 3.6.1. I hesitate to say that, but I can't see another option (I had hoped that offering a bounty would draw an answer that provided one). If so, I'd report it as a bug.

duckmayr