I am trying to understand how set.seed works in R. I understand that it lets me reproduce random samples, but I don't know what the difference is between set.seed(1) and set.seed(123).
What does the argument in the parentheses mean?
The seed argument in set.seed is a single value, interpreted as an integer (as documented in help(set.seed)). Each seed determines its own sequence of pseudorandom values, and that sequence is the same on any computer running the same R version and RNG settings, which is what ensures reproducibility. So the values generated after set.seed(1) and after set.seed(123) will not be the same, but the values R generates on your computer after set.seed(1) will match the values R generates on my computer after the same seed.
set.seed(1)
x <- rnorm(10, 2, 1)  # 10 draws from a normal distribution with mean 2, sd 1
x
# [1] 1.373546 2.183643 1.164371 3.595281 2.329508 1.179532 2.487429 2.738325 2.575781 1.694612
set.seed(123)
y <- rnorm(10, 2, 1)  # same call, different seed
y
# [1] 1.4395244 1.7698225 3.5587083 2.0705084 2.1292877 3.7150650 2.4609162 0.7349388 1.3131471 1.5543380
identical(x, y)
# [1] FALSE
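The other half of the claim, that the same seed gives the same values, can be checked the same way. This is a minimal sketch; the names a and b are only for illustration.

```r
# Reset the same seed before each draw.
set.seed(1)
a <- rnorm(10, 2, 1)

set.seed(1)
b <- rnorm(10, 2, 1)

# Both vectors come from the same starting point, so they match exactly.
identical(a, b)
# [1] TRUE
```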
Most computer programs use deterministic algorithms to generate random numbers, which is why the numbers they generate are not truly random but pseudorandom (good enough for most purposes). R is no different: you can think of the random numbers it generates as part of a very long, fixed string of seemingly random numbers. When you ask for some, R starts at a point in that string and returns a sequence from there.
By using set.seed(), you are basically giving the program a starting point instead of letting it choose its own. That's why any user running the same code with the same seed will get the same results.
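One way to see the "starting point" idea is to draw from the stream in two pieces and compare with a single longer draw from the same seed. This is only a sketch: the seed 42 and the variable names are arbitrary choices for illustration.

```r
set.seed(42)
first  <- runif(3)   # first three values after the starting point
second <- runif(3)   # continues where the previous call stopped

set.seed(42)         # jump back to the same starting point
all_six <- runif(6)  # six values in one go

# The two short draws are just consecutive slices of the same sequence.
identical(c(first, second), all_six)
# [1] TRUE
```

Note that a second call without resetting the seed does not repeat the first values; it carries on along the sequence.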
You can run ?RNGkind for more information on the subject.
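As a rough illustration of what that help page covers, the sketch below queries the generators currently in use and shows that the same seed gives different values under a different generator. The variable names are hypothetical, and the two generator names are among the built-in kinds listed on that help page.

```r
# Remember the current settings so they can be restored afterwards.
old_kinds <- RNGkind()
print(old_kinds)  # the generator kinds currently in use

# Same seed, two different uniform generators.
set.seed(1, kind = "Mersenne-Twister")
mt <- runif(3)
set.seed(1, kind = "Wichmann-Hill")
wh <- runif(3)

# The seed only picks a starting point within a given generator,
# so switching generators is expected to change the values.
identical(mt, wh)

# Restore the original uniform generator.
RNGkind(kind = old_kinds[1])
```

So a seed is only reproducible together with the generator settings it was used with.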