22

I understand what set.seed() does and when I might use it, but I still have many questions about the function. Here are a few:

  1. Is it possible to "reset" set.seed() to something "more random" if you have called set.seed() earlier in your session? Is that even necessary?
  2. Is it possible to view the seed that R is currently using?
  3. Is there a way to make set.seed() allow alphanumeric seeds, the way one can enter them at random.org (be sure you are in the advanced mode, and see "Part 3" of the form to see what I mean)?
A5C1D2H2I1M1N2O1R2T1

5 Answers

18

Just for fun:

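set.seed.alpha <- function(x) {
  # digest() (from the digest package) hashes any R object; crc32 gives a short hex string
  hexval <- paste0("0x", digest::digest(x, "crc32"))
  # as.is = TRUE avoids the warning newer R versions give when it is left unspecified
  intval <- type.convert(hexval, as.is = TRUE) %% .Machine$integer.max
  set.seed(intval)
}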

So you can do:

set.seed.alpha("hello world")

(in fact x can be any R object, not just an alphanumeric string)

Ben Bolker
  • Nice "just for fun" answer. Actually, I would want to create a vector of the seeds to use in a function. So, assuming that I am creating seeds based on an object named `village.names`, I would probably do something more like: `hexval <- paste0("0x", sapply(village.names, digest, "crc32")); intval <- type.convert(hexval) %% .Machine$integer.max` to generate a list of seeds to pass on to another function. Thanks for all your suggestions so far! – A5C1D2H2I1M1N2O1R2T1 Jun 06 '12 at 11:41
15

It's possible, if you set the seed to something like the final digits of your time epoch, but it's really not necessary. The intended use of PRNGs is that you set the seed once at the start of a session, and use successive generated variates from this. Do things differently, and you don't get to enjoy the various good theoretical and empirical properties the R RNGs have.

But I'm not sure you really understand the purpose of set.seed. It's not really there for you to get 'more random' numbers. If you are doing some kind of application for which the R PRNG is insufficient (for instance, if you require cryptographic randomness), you might as well generate all your random numbers by some alternate method and use them directly. The real purpose of set.seed is to make results that depend on RNGs reproducible. If you run the same analysis, with the same sequence of random number calls, and set the seed to the same value, you will always get the same result. This is helpful in debugging, and for others reviewing your results.

To use the epoch time, do something like

t <- as.numeric(Sys.time())
seed <- 1e8 * (t - floor(t))
set.seed(seed); print(seed)
MichaelChirico
Fhnuzoag
  • I understand the purpose of `set.seed()` in terms of making things reproducible and so on. What I was looking at is, say I do `set.seed(123); a = sample(300, 30); b = sample(300, 30)` but I'm only interested in making `a` "reproducible"--I want the results of `b` to be different each time. To me, this means that in between running the line for `a` and the line for `b`, I need to reset the seed somehow. – A5C1D2H2I1M1N2O1R2T1 Jun 06 '12 at 08:55
  • 6
    Oh okay. Well, trying something like `as.numeric(Sys.time())-> t; set.seed((t - floor(t)) * 1e8 -> seed); print(seed)` will be effective for most purposes. – Fhnuzoag Jun 06 '12 at 09:28
  • 2
    Fhnuzoag that should be an answer, not a comment, IMO. – JD Long Jun 06 '12 at 13:52
  • @JDLong, in comparison to `rm(.Random.seed, envir=globalenv())` is there any advantage to Fhnuzoag's solution other than being able to see the new random seed? In my limited use of R, I haven't done much with `globalenv()` so I'm not exactly clear of what I'm doing. I can understand what I'm doing with Fhnuzoag's solution though. I ask because the help file for `.Random.seed` states it *should not be altered by the user*, but it suggest `rm(.Random.seed)` in its examples.... – A5C1D2H2I1M1N2O1R2T1 Jun 06 '12 at 16:16
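The scenario discussed in the comments above (a reproducible `a`, but a `b` that differs on every run) can be sketched as follows. This is a minimal illustration, not code from the thread: `draw_pair` is a hypothetical helper name, and the sketch relies on `set.seed(NULL)`, which `?set.seed` documents as re-initializing the generator as if no seed had yet been set. `rm(.Random.seed, envir = globalenv())` is the alternative mentioned in the comments.

```r
# Hypothetical helper: `a` is reproducible, `b` is drawn after the seed is reset.
draw_pair <- function() {
  set.seed(123)          # fixed seed, so `a` is the same on every call
  a <- sample(300, 30)
  set.seed(NULL)         # re-initialize the RNG as if no seed had been set
  b <- sample(300, 30)   # expected to differ between calls
  list(a = a, b = b)
}

first  <- draw_pair()
second <- draw_pair()

a_matches <- identical(first$a, second$a)  # TRUE: same seed, same draw
b_matches <- identical(first$b, second$b)  # expected FALSE, up to a negligible chance
```

Unlike the epoch-time approach, this does not show you the new seed, so use the `Sys.time()` version if you need to record it.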
7

For your question 3, there is the char2seed function in the TeachingDemos package, which takes a character string (alphanumeric), converts it to an integer, and by default uses that to set a new seed. The idea was that students could use their name (or some combination/subset of names) as a seed so each student gets a different dataset, but the teacher can reproduce each student's dataset.

Greg Snow
  • Thanks for the suggestion. I didn't specify it in my original question, but I need to actually create a vector of seeds based on a vector of character strings. It seems like your function isn't able to do that; am I correct? Great package, by the way. – A5C1D2H2I1M1N2O1R2T1 Jun 06 '12 at 18:56
  • 1
    The current version will only generate 1 seed, not a vector (by design for the purposes above), there is a [[1]] hard coded in. But it could be modified (use sapply instead of grabbing the first element of the result of `strsplit`) to work with vectors, or a simple use of `sapply`, `mapply`, or `Vectorize` could be used with `char2seed` to get a vector result from a vector input. – Greg Snow Jun 06 '12 at 19:23
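The idea of turning a vector of strings into a vector of seeds can be sketched in base R as below. This is not how `char2seed` or `digest` work internally: `string_to_seed` is a hypothetical helper using a simple polynomial rolling hash, chosen only so the example has no package dependencies, and `village_names` is made-up data.

```r
# Hypothetical helper (not from TeachingDemos or digest):
# map a single string to an integer seed with a polynomial rolling hash.
string_to_seed <- function(s) {
  h <- 0
  for (code in utf8ToInt(s)) {
    # keep the running value below .Machine$integer.max so it fits in an integer
    h <- (h * 31 + code) %% .Machine$integer.max
  }
  as.integer(h)
}

# Made-up example data
village_names <- c("Alpha", "Bravo", "Charlie")

# vapply() returns one integer seed per name (names are kept on the result)
seeds <- vapply(village_names, string_to_seed, integer(1))

# Each seed reproduces its own draw
set.seed(seeds[["Alpha"]])
draw_1 <- runif(3)
set.seed(seeds[["Alpha"]])
draw_2 <- runif(3)
same_draw <- identical(draw_1, draw_2)
```

A simple hash like this can collide for different strings, so a proper hash (as in the `digest` answer) is the safer choice when distinct seeds matter.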
3

For an answer to 2, first see the help page ?RNGkind.

To find the kind of RNG in use:

RNGkind()
# [1] "Mersenne-Twister" "Inversion" 

The Mersenne Twister is the default.

From the help page:

‘"Mersenne-Twister":’ From Matsumoto and Nishimura (1998). A twisted GFSR with period 2^19937 - 1 and equidistribution in 623 consecutive dimensions (over the whole period). The ‘seed’ is a 624-dimensional set of 32-bit integers plus a current position in that set.

To find the current seed in use, you first need to call the random number generator (or set.seed), because .Random.seed does not exist in a fresh session until the RNG has been used.

runif(1, 0, 1)
# [1] 0.9834062
.Random.seed
# [Gives a 626 length vector]

Calling set.seed(some_integer) and then inspecting .Random.seed will always give the same 626-length vector for the same some_integer. To put it differently, the 626-length vector is determined solely by some_integer, provided the Mersenne Twister is in use.
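The claim above can be checked directly. The following is a small sketch assuming the default generator; `current_state` is a hypothetical helper name, and the save/restore step relies on the note in `?.Random.seed` that the state can be saved and restored (but should not otherwise be altered).

```r
# Hypothetical helper: read the RNG state from the global environment,
# where R stores .Random.seed once the RNG has been used.
current_state <- function() get(".Random.seed", envir = globalenv())

# Same integer seed, same internal state
set.seed(2012)
state_first <- current_state()
set.seed(2012)
state_second <- current_state()
same_state <- identical(state_first, state_second)

# Save the state, draw, restore the state, draw again: the draws should match
saved <- current_state()
draw_one <- runif(3)
assign(".Random.seed", saved, envir = globalenv())
draw_two <- runif(3)
same_draws <- identical(draw_one, draw_two)
```

Saving and restoring the full state is useful when you want to replay a stretch of random draws from the middle of a session, rather than from a known integer seed.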

Also, of course, running set.seed to some fixed value will give you the same values for calls to random number routines following it. That's the main use for it in practice, to give reproducibility. E.g.

set.seed(1)
runif(5, 0, 1)
# [1] 0.2655087 0.3721239 0.5728534 0.9082078 0.2016819
rnorm(1, 0, 1)
# [1] 1.272429
set.seed(1)
runif(5, 0, 1)
# [1] 0.2655087 0.3721239 0.5728534 0.9082078 0.2016819
rnorm(1, 0, 1)
# [1] 1.272429

All the basic number generator code in R is in the file src/main/RNG.c in the source code.

It is in C, but fairly easy to follow.

MichaelChirico
Faheem Mitha
1

I had the same issue as in question 1. I found that I can simply reset the seed inside the loop:

set.seed(123)
x <- rnorm(10, 1, 1)
set.seed(NULL)  # NULL must be uppercase; lowercase null is not defined in R

This way the seed is re-initialized at the end of each iteration, as if it had never been set. It worked for me.

outboundbird
  • Thanks for the answer. A more common/orthodox approach would be `rm(.Random.seed, envir=globalenv())`, which is mentioned at the help file for `?.Random.seed`... which I only arrived at after this question :-) – A5C1D2H2I1M1N2O1R2T1 Feb 18 '16 at 02:43
  • `set.seed(null)` has not worked for me. Is there any other way to reset the seed value or to nullify it. – mockash Nov 03 '16 at 11:40