8

I want to supply a negative exponent for the scipy.stats.powerlaw routine, e.g. a=-1.5, in order to draw random samples:

"""
powerlaw.pdf(x, a) = a * x**(a-1)
"""

from scipy.stats import powerlaw
R = powerlaw.rvs(a, size=100)

Why is a > 0 required, how can I supply a negative a in order to generate the random samples, and how can I supply a normalization coefficient/transform, i.e.

PDF(x,C,a) = C * x**a

The documentation is here

http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.powerlaw.html

Thanks!

EDIT: I should add that I'm trying to replicate IDL's RANDOMP function:

http://idlastro.gsfc.nasa.gov/ftp/pro/math/randomp.pro

Eric Brown
  • 13,774
  • 7
  • 30
  • 71
jtlz2
  • 7,700
  • 9
  • 64
  • 114

5 Answers5

6

A PDF, integrated over its domain, must equal one. In other words, the area under a probability density function's curve must equal one.

In [36]: import scipy.integrate as integrate
In [40]: y, err = integrate.quad(lambda x: 0.5*x**(-0.5), 0, 1)

In [41]: y
Out[41]: 0.9999999999999998  # The integral is close to 1

The powerlaw density function has a domain from 0 <= x <= 1. On this domain, the integral of x**b is finite for any b > -1. When b is smaller, x**b blows up too rapidly near x = 0. So it is not a valid probability density function when b <= -1.

In [38]: integrate.quad(lambda x: x**(-1), 0, 1)
UserWarning: The maximum number of subdivisions (50) has been achieved...
# The integral blows up

Thus for x**(a-1), a must satisfy a-1 > -1 or equivalently, a > 0.

The first constant a in a * x**(a-1) is the normalizing constant which makes the integral of a * x**(a-1) over the domain [0,1] equal to 1. So you don't get to choose this constant independent of a.

Now if you change the domain to be a measurable distance away from 0, then yes, you could define a PDF of the form C * x**a for negative a. But you'd have to state what domain you want, and I don't think there is (yet) a PDF available in scipy.stats for this.

unutbu
  • 842,883
  • 184
  • 1,785
  • 1,677
  • To the last part: using a positive location `loc` we can shift the distribution. It would follow then from this explanation, that the restriction on `a` could be relaxed as a function of the location `loc`. Should be worth some testing and it should be possible to make the extension in scipy.stats. – Josef Jul 26 '13 at 21:47
  • 2
    @user333700: Although you could use `loc` to shift the distribution, the restriction `a > 0` would remain, since the shift is performed last, after the underlying distribution is generated. – unutbu Jul 26 '13 at 21:56
  • You are right, I didn't think about this correctly. It would need an extra shape parameter to shift and widen the support of the distribution. – Josef Jul 29 '13 at 06:46
  • I think this is not yet possible in scipy, but I really wonder why is so hard. This pdf has an analytic cumulative so it can be easily solved as pointed out by others with the method outlined in textbooks or [here](https://mathworld.wolfram.com/RandomNumber.html) – Rho Phi Jan 05 '21 at 16:54
6

The Python package powerlaw can do this. Consider for a>1 a power law distribution with probability density function

f(x) = c * x^(-a) 

for x > x_min and f(x) = 0 otherwise. Here c is a normalization factor and is determined as

c = (a-1) * x_min^(a-1).

In the example below it is a = 1.5 and x_min = 1.0 and comparing the probability density function estimated from the random sample with the PDF from the expression above gives the expected result.

import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as pl

import numpy as np
import powerlaw

a, xmin = 1.5, 1.0
N = 10000

# generates random variates of power law distribution
vrs = powerlaw.Power_Law(xmin=xmin, parameters=[a]).generate_random(N)

# plotting the PDF estimated from variates
bin_min, bin_max = np.min(vrs), np.max(vrs)
bins = 10**(np.linspace(np.log10(bin_min), np.log10(bin_max), 100))
counts, edges = np.histogram(vrs, bins, density=True)
centers = (edges[1:] + edges[:-1])/2.

# plotting the expected PDF 
xs = np.linspace(bin_min, bin_max, 100000)
pl.plot(xs, [(a-1)*xmin**(a-1)*x**(-a) for x in xs], color='red')
pl.plot(centers, counts, '.')

pl.xscale('log')
pl.yscale('log')

pl.savefig('powerlaw_variates.png')

returns

power_law

Felix Hoffmann
  • 322
  • 1
  • 4
  • 12
3

If r is a uniform random deviate U(0,1), then x in the following expression is a power-law distributed random deviate:

x = xmin * (1-r) ** (-1/(alpha-1))

where xmin is the smallest (positive) value above which the power-law distribution holds, and alpha is the exponent of the distribution.

Virgil
  • 31
  • 2
0

If you want to generate power-law distribution, you can use a random deviation. You just have to generate a random number between [0,1] and apply the inverse method (Wolfram). In this case, the probability density function is:

p(k) = k^(-gamma)

and y is the variable uniform between 0 and 1.

y ~ U(0,1)

import numpy as np

def power_law(k_min, k_max, y, gamma):
    return ((k_max**(-gamma+1) - k_min**(-gamma+1))*y  + k_min**(-gamma+1.0))**(1.0/(-gamma + 1.0))

Now to generate a distribution, you just have to create an array

nodes = 1000
scale_free_distribution = np.zeros(nodes, float)
k_min = 1.0
k_max = 100*k_min
gamma = 3.0

for n in range(nodes):
    scale_free_distribution[n] = power_law(k_min, k_max,np.random.uniform(0,1), gamma)

This will work to generate a power-law distribution with gamma=3.0, if you want to fix the average of distribution, you have to study Complex Networks cause the k_min depends of k_max and the average connectivity.

Emanuel Fontelles
  • 864
  • 1
  • 11
  • 14
0

My answer is almost the same as Virgil's above, with the crucial difference that that alpha is actually the negative exponent of powerlaw distribution

So, if r is a uniform random deviate U(0,1), then x in the following expression is a power-law distributed random deviate:

x = xmin * (1-r) ** (-1/(alpha-1))

where xmin is the smallest (positive) value above which the power-law distribution holds, and alpha is the negative exponent of the distribution, that is the P(x) = [constant] * x**-alpha