2

I have an application where I need numpy.random.normal but from a crypgoraphic PRNG source. Numpy doesn't seem to provide this option.

The best I could find was numpy.random.entropy.random_entropy but this is only uint32 and it's buggy, with large arrays you get "RuntimeError: Unable to read from system cryptographic provider" even though urandom is non-blocking...

You can however do this: np.frombuffer(bytearray(os.urandom(1000*1000*4)), dtype=np.uint32).astype(np.double).reshape(1000, 1000)

But I'm still left with the problem of somehow converting it to a Guassian and not screwing anything up.

Is there a solution someone knows? Google is poisoned by numpy seeding from /dev/urandom, I don't need seeding, I need urandom being the only source of all randomness.

  • I guess with appropriate casting etc you can get a _uniform_ distribution from `dev/urandom`. So what you want to look for is something like "How to convert a uniform distribution to a normal distribution": https://mathoverflow.net/questions/28557/how-do-i-convert-a-uniform-value-in-0-1-to-a-standard-normal-gaussian-distri – cadolphs Jan 29 '20 at 16:55
  • normalize int values sampled from urandom to [0,1], then use inverse CDF to convert to normally distributed random. (upd: Lagerbaer's like suggests the same) – Marat Jan 29 '20 at 16:56
  • To supplement the answer linked by @Lagerbaer, [this answer](https://stackoverflow.com/a/42768755/577088) has a fast cython implementation of the polar Box-Muller transform that you might find useful. – senderle Jan 29 '20 at 20:09

1 Answers1

1

I came across scipy.stats.rvs_ratio_uniforms and adapted their code for my purpose. It's only 3 times slower than np.random.normal despite sampling twice the randomness from a cryptographic source.

import numpy as np
import os


def uniform_0_1(size):
    return np.frombuffer(bytearray(os.urandom(size*4)), dtype=np.uint32).astype(np.float) / 2**32

def normal(mu, sigma, size):
    bmp = np.sqrt(2.0/np.exp(1.0)) # about 0.8577638849607068

    size1d = tuple(np.atleast_1d(size))
    N = np.prod(size1d)  # number of rvs needed, reshape upon return

    x = np.zeros(N)
    simulated = 0

    while simulated < N:
        k = N - simulated

        a = uniform_0_1(size=k)
        b = (2.0 * uniform_0_1(size=k) - 1.0) * bmp

        accept = (b**2 <= - 4 * a**2 * np.log(a))
        num_accept = np.sum(accept)

        if num_accept > 0:
            x[simulated : (simulated + num_accept)] = (b[accept] * sigma / a[accept]) + mu
            simulated += num_accept


    return np.reshape(x, size1d)

One worry though, numpy.random.random_sample : Return random floats in the half-open interval [0.0, 1.0).

I'm not sure how to achieve that guarantee (never 1.0) with my uniform_0_1 or whether it even matters.