0

I am trying to call scipy.stats.multivariate_normal with four different parameters for mu and sigma. And then for each generated probability density function I need to call that pdf on an array of say, 10 values.

For simplicity let's say that above mentioned function is addXY:

def addXY(x, y):
    return x+y

params=[[1,2],[1,3],[1,4],[1,5]]       # mu and sigma, four versions
inputs=[1,2,3]                         # values, in this case 3 of them

matrix = []
for pdf_params in params:
    row = []
    for inp in inputs:
        entry = addXY(*pdf_params) 
        row.append(entry*inp)
    matrix.append(row)
print matrix
  1. Is this pythonic?

  2. Is there a way to pass params and inputs and get a matrix with all combinations in it that is more pythonic/vectorized/faster?

!Important notice: Inputs in the example are scalar values (I've set scalar values to simplify problem description, I am actually using array of n-dimensional vectors and thus multivariate_normal pdf).

Hints and tips about similar operations are welcome.

Dusan J.
  • 303
  • 5
  • 19
  • *"I've set scalar values to simplify problem description..."* Heh, I think you simplified too much :) (see my answer). How about showing a more appropriate example? You'll need each *mu* to be an n-dimensional vector, and each "sigma" will actually be an n-by-n covariance matrix. – Warren Weckesser Jan 07 '16 at 21:14
  • Yes, you are right! Is it possible to do it with the updated info? – Dusan J. Jan 07 '16 at 21:21
  • Unforunately it looks like `multivariate_normal.pdf` does not broadcast its arguments, so you'll have write a loop to handle different mean and covariance values. – Warren Weckesser Jan 07 '16 at 21:22

2 Answers2

1

Based on your description of what you are trying to compute, you don't need multivariate_normal. You are calling the PDF method with a set of scalar values for a distribution with a scalar mu and sigma. So you can use the pdf() method of scipy.stats.norm. This method will broadcast its arguments, so by passing in arrays with the proper shape, you can compute the PDF for the different values of mu and sigma in one call. Here's an example.

Here are your x values (you called them inputs), and the parameters:

In [23]: x = np.array([1, 2, 3])

In [24]: params = np.array([[1, 2], [1, 3], [1, 4], [1, 5]])

For convenience, separate the parameters into arrays of mu and sigma values.

In [25]: mu = params[:, 0]

In [26]: sig = params[:, 1]

We'll use scipy.stats.norm to compute the PDF.

In [27]: from scipy.stats import norm

This call computes the PDF for the desired combinations of x and parameters. mu.reshape(-1, 1) and sig.reshape(-1, 1) are 2D arrays with shape (4, 1). x has shape (3,), so when these arguments are broadcast, the result has shape (4, 3). Each row is the PDF evaluated at x for one of the pairs of mu and sigma.

In [28]: p = norm.pdf(x, loc=mu.reshape(-1, 1), scale=sig.reshape(-1, 1))

In [29]: p
Out[29]: 
array([[ 0.19947114,  0.17603266,  0.12098536],
       [ 0.13298076,  0.12579441,  0.10648267],
       [ 0.09973557,  0.09666703,  0.08801633],
       [ 0.07978846,  0.07820854,  0.07365403]])

In other words, the rows of p are:

norm.pdf(x, loc=mu[0], scale=sig[0])
norm.pdf(x, loc=mu[1], scale=sig[1])
norm.pdf(x, loc=mu[2], scale=sig[2])
norm.pdf(x, loc=mu[3], scale=sig[3])
Warren Weckesser
  • 110,654
  • 19
  • 194
  • 214
  • I have just previewed the answer and I believe that it is the solution to my question, but I have one addition to add and would be really nice if you updated your question. In the given example I used function addXY and scalar values for input for the sake of simplifying the problem description. I would of course use an array of vectors, not scalars, for multivariate_normal pdf input. – Dusan J. Jan 07 '16 at 21:14
0

This is only my idea to shorten the code and utilize more library.

In your code, in fact, you do not use numpy, scipy. Question will be whether you would like to use numpy.array for further data processing.

Option 1: just use list to present array and list of list to present matrix:

from itertools import product
matrix_list = [sum(param)*input_x for param, input_x in product(params, inputs)]
matrix = zip(*[iter(matrix_list)]*len(inputs))
print matrix

Credit for using zip method should be given to convert a flat list to list of list in python

Option 2: use numpy.array and numpy.matrix for further processing

from itertools import product
import numpy as np
matrix_array = np.array([sum(param)*input_x for param, input_x in      product(params, inputs)])
matrix = matrix_array.reshape(len(params),len(inputs))
print matrix
Community
  • 1
  • 1
White
  • 627
  • 4
  • 10
  • Yes but the addXY function above is a dummy function used for simplifying the example. Actual function is multivariate_normal with with inputs of n-dimensional vectors. Thanks for trying anyways! – Dusan J. Jan 07 '16 at 21:00