I couldn't find a builtin function in numpy, but it can be self-implemented by performing a cholesky decomposition of the covariance matrix Σ = LLᵀ and then making use of the fact that, given a vector X of i.i.d. standard normal variables, the transformation LX + µ has covariance Σ and mean µ.
This can be implemented using e.g. np.linalg.cholesky()
(note that this function supports batch mode!), and np.random.normal()
:
# cov: (*B, D, D)
# mean: (*B, D)
# result: (*S, *B, D)
L = np.linalg.cholesky(cov)
X = np.random.standard_normal((*S, *B, D, 1))
Y = (L @ X).reshape(*S, *B, D) + mean
Here, packed in a function for easier use:
import numpy as np
def sample_batch_mvn(
mean: np.ndarray,
cov: np.ndarray,
size: "tuple | int" = (),
) -> np.ndarray:
"""
Batch sample multivariate normal distribution.
Arguments:
mean: expected values of shape (…M, D)
cov: covariance matrices of shape (…M, D, D)
size: additional batch shape (…B)
Returns: samples from the multivariate normal distributions
shape: (…B, …M, D)
It is not required that ``mean`` and ``cov`` have the same shape
prefix, only that they are broadcastable against each other.
"""
mean = np.asarray(mean)
cov = np.asarray(cov)
size = (size, ) if isinstance(size, int) else tuple(size)
shape = size + np.broadcast_shapes(mean.shape, cov.shape[:-1])
X = np.random.standard_normal((*shape, 1))
L = np.linalg.cholesky(cov)
return (L @ X).reshape(shape) + mean
Now in order to test this function, we first need a good batch of covariance matrices. We'll generate a couple to test the sampling performance a bit:
# Generate N batch of D-dimensional covariance matrices C:
N = 5000
D = 2
L = np.zeros((N, D, D))
L[(..., *np.tril_indices(D))] = \
np.random.normal(size=(N, D * (D + 1) // 2))
cov = L @ np.swapaxes(L, -1, -2)
The method used to generate the covariance matrices here in fact works by sampling the Cholesky factors L. With prior knowledge of these factors, we of course wouldn't need to compute the Cholesky decomposition in the sampling function. However, to test the general applicability of the function, we will forget about them and just pass the covariance matrices C:
mean = np.zeros(2)
samples = sample_batch_mvn(mean, cov, 1000)
print(samples.shape) # (1000, 5000, 2)
Sampling these 5 million 2D vectors takes about 0.4s on my PC.
And, as almost always, the a considerable amount of effort will go into plotting (here showing some samples for the first 9 of the 5000 covariance matrices):

import scipy.stats as stats
import matplotlib.pyplot as plt
fig, axs = plt.subplots(3, 3, figsize=(9, 9))
for ax, i in zip(axs.ravel(), range(5000)):
cc = cov[i]
xsamples = samples[:100, i, 0]
ysamples = samples[:100, i, 1]
xmin = xsamples.min()
xmax = xsamples.max()
ymin = ysamples.min()
ymax = ysamples.max()
xpad = (xmax - xmin) * 0.05
ypad = (ymax - ymin) * 0.05
xlim = (xmin - xpad, xmax + xpad)
ylim = (ymin - ypad, ymax + ypad)
xs = np.linspace(*xlim, num=51)
ys = np.linspace(*ylim, num=51)
xy = np.dstack(np.meshgrid(xs, ys))
pdf = stats.multivariate_normal.pdf(xy, mean, cc)
ax.contourf(xs, ys, pdf, 33, cmap='YlGnBu')
ax.plot(xsamples, ysamples, 'r.', alpha=.6,
markeredgecolor='k', markeredgewidth=0.5)
ax.set_xlim(*xlim)
ax.set_ylim(*ylim)
plt.show()
Some inspiration for this: