
I'm trying to run a vectorised batch multivariate sampling operation with Numpy. I have k mean vectors of shape [N,] corresponding to k covariance matrices of shape [N, N], and I'd like to return k draws of shape [N,], one from each of the corresponding multivariate normal distributions.

I currently have a loop that does this,

for batch in range(batch_size):
    c[batch, :] = np.random.multivariate_normal(mean=a[batch, :], cov=b[batch, :, :])

but would like to consolidate the above into a vectorised operation. The issue is that np.random.multivariate_normal can only take a 1-D array as the mean and a 2-D array as the covariance.
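For instance, passing the batched arrays directly fails (shapes here made up for illustration):

a = np.zeros((4, 3))           # k = 4 mean vectors of shape (3,)
b = np.stack([np.eye(3)] * 4)  # k = 4 covariance matrices of shape (3, 3)
np.random.multivariate_normal(mean=a, cov=b)  # raises a ValueError (mean must be 1-D)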

I can do batch-sampling via PyTorch's multivariate normal class, but I'm trying to integrate with some pre-existing Numpy code, and I'd prefer to limit the number of conversions happening.
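For reference, the PyTorch route I'm referring to looks roughly like this (a sketch using torch.distributions.MultivariateNormal; the array/tensor round-trips are exactly what I'd like to avoid):

import torch

# loc has shape (k, N) and covariance_matrix has shape (k, N, N),
# so a single distribution object handles the whole batch
mvn = torch.distributions.MultivariateNormal(
    loc=torch.as_tensor(a), covariance_matrix=torch.as_tensor(b))
c = mvn.sample().numpy()  # k draws of shape (N,), converted back to Numpy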

Googling pulled up this question, which could be resolved by melting the mean, but in my case, I'm not using the same covariance matrix and can't go about things exactly the same way.

Thank you very much for your help. I figure there's a good chance that I won't be able to handle batches using the Numpy distribution because of the argument constraints, but wanted to make sure I wasn't missing anything.

potpie
  • Have you tried to use tensorflow's probability module? – Hakan Akgün Oct 01 '21 at 20:31
  • @HakanAkgün, I actually ultimately re-implemented my code using PyTorch to handle the batching, so I was able to proceed. Nonetheless, I'd still be curious to know if there's a way of doing this in Numpy. – potpie Oct 02 '21 at 23:45
  • I guess it's not possible without resorting to a for loop. However, I think TensorFlow has quite an easy function for that; you can check https://www.tensorflow.org/probability/api_docs/python/tfp/distributions/MultivariateNormalDiag – Hakan Akgün Oct 03 '21 at 07:39

1 Answer


I couldn't find a built-in function in numpy, but it can be self-implemented by performing a Cholesky decomposition of the covariance matrix Σ = LLᵀ and then making use of the fact that, given a vector X of i.i.d. standard normal variables, the transformation LX + µ has covariance Σ and mean µ.

This can be implemented using e.g. np.linalg.cholesky() (note that this function supports batch mode!), and np.random.normal():

# cov:    (*B, D, D)   B is an arbitrary batch shape, D the dimensionality
# mean:   (*B, D)
# result: (*S, *B, D)  S is an additional shape of samples per distribution
L = np.linalg.cholesky(cov)                    # batch of Cholesky factors
X = np.random.standard_normal((*S, *B, D, 1))  # i.i.d. standard normal draws
Y = (L @ X).reshape(*S, *B, D) + mean          # LX + µ

Here it is, packed into a function for easier use:

import numpy as np


def sample_batch_mvn(
    mean: np.ndarray,
    cov: np.ndarray,
    size: "tuple | int" = (),
) -> np.ndarray:
    """
    Batch sample multivariate normal distribution.

    Arguments:

        mean: expected values of shape (…M, D)
        cov: covariance matrices of shape (…M, D, D)
        size: additional batch shape (…B)

    Returns: samples from the multivariate normal distributions
             shape: (…B, …M, D)

    It is not required that ``mean`` and ``cov`` have the same shape
    prefix, only that they are broadcastable against each other.
    """
    mean = np.asarray(mean)
    cov = np.asarray(cov)
    size = (size, ) if isinstance(size, int) else tuple(size)
    shape = size + np.broadcast_shapes(mean.shape, cov.shape[:-1])
    X = np.random.standard_normal((*shape, 1))
    L = np.linalg.cholesky(cov)
    return (L @ X).reshape(shape) + mean
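As the docstring notes, mean and cov only need to be broadcastable against each other; for instance (shapes made up for illustration):

m = np.zeros(3)                # one mean, shape (D,)
C = np.stack([np.eye(3)] * 4)  # four identity covariances, shape (M, D, D)
out = sample_batch_mvn(m, C, size=(2, 5))
print(out.shape)  # (2, 5, 4, 3), i.e. (…B, M, D)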

Now, in order to test this function, we first need a good batch of covariance matrices, so we'll generate a batch of random ones and use them to test the sampling performance a bit:

# Generate a batch of N D-dimensional covariance matrices C:
N = 5000
D = 2

L = np.zeros((N, D, D))
# fill the lower triangle of each matrix with standard normal variates
L[(..., *np.tril_indices(D))] = \
    np.random.normal(size=(N, D * (D + 1) // 2))
# C = L Lᵀ is symmetric positive semi-definite by construction
cov = L @ np.swapaxes(L, -1, -2)

The method used to generate the covariance matrices here in fact works by sampling the Cholesky factors L. With prior knowledge of these factors, we of course wouldn't need to compute the Cholesky decomposition in the sampling function. However, to test the general applicability of the function, we will forget about them and just pass the covariance matrices C:

mean = np.zeros(2)
samples = sample_batch_mvn(mean, cov, 1000)

print(samples.shape)   # (1000, 5000, 2)

Sampling these 5 million 2D vectors takes about 0.4s on my PC.
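Incidentally, since the covariance matrices above were built directly from their Cholesky factors, sampling could skip the decomposition entirely by reusing them; a minimal sketch, assuming L, mean, N and D from above:

# hypothetical shortcut: reuse the known factors L instead of having
# sample_batch_mvn recompute them via np.linalg.cholesky
X = np.random.standard_normal((1000, N, D, 1))
direct = (L @ X).reshape(1000, N, D) + mean
print(direct.shape)  # (1000, 5000, 2)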

And, as almost always, a considerable amount of effort goes into plotting (here showing some samples for the first 9 of the 5000 covariance matrices):

[Figure: samples overlaid on the probability density function of each distribution]

import scipy.stats as stats
import matplotlib.pyplot as plt


fig, axs = plt.subplots(3, 3, figsize=(9, 9))
for ax, i in zip(axs.ravel(), range(5000)):
    cc = cov[i]

    xsamples = samples[:100, i, 0]
    ysamples = samples[:100, i, 1]

    xmin = xsamples.min()
    xmax = xsamples.max()
    ymin = ysamples.min()
    ymax = ysamples.max()
    xpad = (xmax - xmin) * 0.05
    ypad = (ymax - ymin) * 0.05

    xlim = (xmin - xpad, xmax + xpad)
    ylim = (ymin - ypad, ymax + ypad)
    xs = np.linspace(*xlim, num=51)
    ys = np.linspace(*ylim, num=51)
    xy = np.dstack(np.meshgrid(xs, ys))

    pdf = stats.multivariate_normal.pdf(xy, mean, cc)

    ax.contourf(xs, ys, pdf, 33, cmap='YlGnBu')
    ax.plot(xsamples, ysamples, 'r.', alpha=.6,
            markeredgecolor='k', markeredgewidth=0.5)

    ax.set_xlim(*xlim)
    ax.set_ylim(*ylim)

plt.show()


coldfix