2

Over the past years the random number generation in NumPy has gotten an overhaul. The modern way of setting up reproducible random numbers is now something like this:

# Set up random number generator
import numpy as np
stream = np.random.PCG64DXSM
seed = 42
rng = np.random.Generator(stream(seed))

# Draw random numbers from various distributions
a = rng.uniform (0, 1, size=10)  # uniform  distribution
b = rng.normal  (0, 1, size=10)  # normal   distribution
c = rng.rayleigh(1,    size=10)  # Rayleigh distribution
...

This work great. The three arrays a, b and c all contain double-precision floats, i.e. their dtype is np.float64. How would I go about changing this to e.g. np.float32?

I can of course just convert the results to 32-bit, but that's not optimal (and this doesn't work at all if we extend the question to 128-bit). Getting 32-bit results are supported in the docs, but I can't find out how to do it using the above system of first defining an rng using np.random.Generator(stream(seed)). Another doc entry states that

Optional dtype argument that accepts np.float32 or np.float64 ...

but again I cannot seem to find where in the above such an argument may be placed.

A similar question has been asked before, but all results pertain to older NumPy interfaces than the modern one showcased above.

To be concrete, let's focus on the newest version of NumPy (1.22), which do come with some changes to the random number generation, though I think only bug fixes.

jmd_dk
  • 12,125
  • 9
  • 63
  • 94

1 Answers1

1

The documentation that you linked to refers to the "Optional dtype argument that accepts np.float32 or np.float64 to produce either single or double precision uniform random variables for select distributions"; in fact, there are just four "select distributions" that accept the dtype argument: random (uniform distribution on [0, 1)), standard_exponential, standard_gamma and standard_normal.

For example,

In [26]: rng = np.random.default_rng()

In [27]: rng.random(size=10, dtype=np.float32)
Out[27]: 
array([0.13855255, 0.33742118, 0.51912105, 0.59625137, 0.9547056 ,
       0.20756459, 0.4408238 , 0.37817073, 0.62999   , 0.36310232],
      dtype=float32)

In [28]: rng.standard_gamma(shape=6.5, size=10, dtype=np.float32)
Out[28]: 
array([ 4.99139  ,  7.1637936,  5.1287727,  5.8192725,  7.1051316,
        1.9996499, 14.339245 ,  6.158779 ,  5.6176057,  6.186189 ],
      dtype=float32)

For other distributions, you would have to generate 64 bit values and cast to float32, as you suggested.

Warren Weckesser
  • 110,654
  • 19
  • 194
  • 214
  • @EricPostpischil, good point about casting from float64 to float32. Fortunately, the `random` method is one of the distributions that has the dtype parameter, so you wouldn't have to cast a float64 output to float32. If you want float32, use `dtype=np.float32` to get float32 output directly. – Warren Weckesser Jan 21 '22 at 14:34
  • Sorry, I do not know what made me think there was a conversion in here. (I hope the routines are not converting internally, but I have little faith in that; it would be so easy for somebody to return a “float” result by generating a 32-bit integer and dividing by 2^32, which will round some values to 1. Maybe somebody can generate a few hundred million `float32` values and see if any of them are exactly 1?) – Eric Postpischil Jan 21 '22 at 14:36
  • 1
    @EricPostpischil [numpy](https://github.com/numpy/numpy/blob/eee12b6fcf2b05bcdd7fc55da1a5df5d3bee9c28/numpy/random/src/distributions/distributions.c#L19) is doing `(next_uint32(bitgen_state) >> 8) * (1.0f / 16777216.0f)` which looks reasonable to me – Sam Mason Jan 26 '22 at 19:34