7

I have two arrays. x is the independent variable, and counts is the number of times each value of x occurs, like a histogram. I know I can calculate the mean by defining a function:

def mean(x,counts):
    return np.sum(x*counts) / np.sum(counts)

Is there a general function I can use to calculate each moment from the distribution defined by x and counts? I would also like to compute the variance.

Alex Riley
noob4life

3 Answers

11

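You could use the moment function from scipy.stats. It calculates the n-th central moment of a sample. Note that it has no weights argument, so it operates on raw data rather than on an (x, counts) pair.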

You could also define your own function, which could look something like this:

def nmoment(x, counts, c, n):
    return np.sum(counts*(x-c)**n) / np.sum(counts)

In that function, c is meant to be the point around which the moment is taken, and n is the order. So to get the variance you could do nmoment(x, counts, np.average(x, weights=counts), 2).
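As a rough sketch of how that function might be used (the arrays below are made up for illustration, and counts is assumed to hold non-negative weights), the first moment about zero gives the mean and the second moment about the mean gives the variance:

```python
import numpy as np

def nmoment(x, counts, c, n):
    # n-th moment of x about the point c, weighting each x by its count
    return np.sum(counts * (x - c) ** n) / np.sum(counts)

# Illustrative data (not from the question):
# the value 1 occurs twice, 2 occurs four times, 3 occurs twice
x = np.array([1.0, 2.0, 3.0])
counts = np.array([2, 4, 2])

mean = nmoment(x, counts, 0, 1)         # first moment about 0 is the mean
variance = nmoment(x, counts, mean, 2)  # second moment about the mean
third = nmoment(x, counts, mean, 3)     # third central moment (unnormalized skewness)
```

Since this divides by the total count, the variance here is the population variance, not the sample variance with an n-1 denominator.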

Soravux
Curt F.
  • What is `counts`? Shouldn't the moment be `np.mean((x-c)**n)`? – wsdzbm Jun 27 '17 at 03:49
  • @Lee, this is really a question for the OP, as I just recapitulated their usage of `counts`. It seems like it's essentially a weight vector that says how much to weight each data point. – Curt F. Jun 27 '17 at 18:27
1
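import numpy as np
from scipy import stats

# stats.moment takes raw samples, so expand x by its (integer) counts first
samples = np.repeat(x, counts)
stats.moment(samples, moment = 2) # second central moment, i.e. the population variance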

stats.moment returns the n-th central moment.
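Because stats.moment works on raw samples and takes no weights, one possible way to apply it to an (x, counts) pair is to rebuild the sample with np.repeat. The sketch below assumes counts holds non-negative integers; the data is invented for illustration, and the order is passed positionally to avoid depending on the keyword name:

```python
import numpy as np
from scipy import stats

# Illustrative data (an assumption, not from the question)
x = np.array([1.0, 2.0, 3.0])
counts = np.array([2, 4, 2])  # must be non-negative integers for np.repeat

# Rebuild the raw sample: [1, 1, 2, 2, 2, 2, 3, 3]
samples = np.repeat(x, counts)

# Second central moment of the expanded sample
var_scipy = stats.moment(samples, 2)

# The same quantity computed directly from the weighted formula
mean = np.average(x, weights=counts)
var_direct = np.sum(counts * (x - mean) ** 2) / np.sum(counts)
```

For large counts the expanded array can get big, in which case the direct weighted formula is likely the cheaper option.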

1

NumPy now provides averages and variances directly (of these, only np.average accepts weights):

https://numpy.org/doc/stable/reference/routines.statistics.html

  • np.average
  • np.std
  • np.var, etc.
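np.average takes a weights argument, while np.var and np.std do not, so a weighted variance can be assembled from two np.average calls. This is only a sketch with made-up data, treating counts as the weights:

```python
import numpy as np

# Illustrative data (an assumption, not from the question)
x = np.array([1.0, 2.0, 3.0])
counts = np.array([2, 4, 2])

# Weighted mean of x
mean = np.average(x, weights=counts)

# Weighted mean of squared deviations, i.e. the population variance
variance = np.average((x - mean) ** 2, weights=counts)

# Standard deviation follows from the variance
std = np.sqrt(variance)
```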
Souradeep Nanda