This comes down to the use of different algorithms for randomized selection. There are numerous equivalent ways to select items at random with replacement using a pseudorandom generator (or to generate random variates from any other distribution). In particular, the algorithm for numpy.random.choice
need not make use of numpy.random.randint
in theory. What matters is that these equivalent ways should produce the same distribution of random variates. In the case of NumPy, look at NumPy's source code.
Another, less important, reason for different results is that the two different selection procedures (randint
and choice
) produce pseudorandom numbers themselves, which can differ from each other because the selection procedures didn't begin with the same seed (more precisely, the same sequence of pseudorandom numbers). If we set the seed to the same value before beginning each procedure:
np.random.seed(335)
y=np.random.normal(0,1,5)
b=np.empty(len(y))
np.random.seed(999999) # Seed selection procedure 1
for j in range(len(y)):
a = np.random.randint(1,len(y))
b[j] = y[a-1]
np.random.seed(999999) # Seed selection procedure 2
c = np.random.choice(y, size=5)
print(b)
print(c)
then each procedure will begin with the same pseudorandom numbers. But even so, the two procedures may use different algorithms for random selection, and these differences may still lead to different results.
(However, numpy.random.*
functions, such as randint
and choice
, have become legacy functions as of NumPy 1.17, and their algorithms are expected to remain as they are for backward compatibility reasons. That version didn't deprecate any numpy.random.*
functions, however, so they are still available for the time being. See also this question. In newer applications you should make use of the new system introduced in version 1.17, including numpy.random.Generator
, if you have that version or later. One advantage of the new system is that the application relies less on global state.)