3

To approximate the value of Pi, consider this stochastic (Monte Carlo) method: it populates arrays with uniform random values in [0, 1), tests each (x, y) pair for unit-circle inclusion, and multiplies the resulting fraction by 4, since the fraction of points that land inside the quarter circle approximates pi/4:

import random as rd
import numpy as np

def r(_): return rd.random()

def np_pi(n):
    v_r = np.vectorize(r)
    x = v_r(np.zeros(n))
    y = v_r(np.zeros(n))

    return sum (x*x + y*y <= 1) * 4. / n

Note that the random number generation above relies on the Python standard library; alternatively, NumPy's own random number generation can be used:

def np_pi(n):
    x = np.random.random(n)
    y = np.random.random(n)

    return sum(x*x + y*y <= 1) * 4. / n

Now consider the non-vectorized approach:

import random as rd

def dart_board():
    x,y = rd.random(), rd.random()
    return (x*x + y*y <= 1)

def pi(n):
    s = sum([dart_board() for _ in range(n)])
    return s * 4. / n

The non-vectorized form proves about 4 times faster on average than the vectorized counterpart. For instance, with n = 5000000, timing both scripts from the OS command line (Python 2.7, quad-core, 8 GB RAM, Red Hat Linux):

time python pi.py
time python np_pi.py

Hence the question: how can the vectorized approach be improved so that it performs at least as well as the non-vectorized one?
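For reference, the same comparison can also be run in-process with the standard timeit module instead of timing whole scripts with the shell's time command. This is only a sketch; it assumes np_pi and pi from the listings above are defined in the same module:

import timeit

# assumes np_pi(n) and pi(n) from above are defined in this module
n = 5000000
print(timeit.timeit(lambda: np_pi(n), number=1))  # vectorized version
print(timeit.timeit(lambda: pi(n), number=1))     # non-vectorized version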

elm
  • 20,117
  • 14
  • 67
  • 113
  • Note that using Monte-Carlo methods to calculate pi is [hopelessly inefficient](http://stackoverflow.com/a/18141358/2647279), so it is useless for anything other than demonstrating Monte Carlo techniques. Better to use some sort of [series approximation](http://mathworld.wolfram.com/PiFormulas.html) if you want a fast method to calculate pi. – Bas Swinckels Feb 01 '15 at 22:28
  • @BasSwinckels fully agree, this was more of a first exploration on numpy and its efficiency. – elm Feb 02 '15 at 20:59
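As the comment above points out, a series formula reaches far more digits with far less work than Monte Carlo sampling, whose error shrinks only like 1/sqrt(n). The following is a minimal illustrative sketch (not part of the original question) based on Machin's formula pi/4 = 4*arctan(1/5) - arctan(1/239), with arctan expanded as its Taylor series; the name machin_pi is made up for the example:

def machin_pi(terms=10):
    # arctan(x) = x - x**3/3 + x**5/5 - ...  (Taylor series, valid for |x| < 1)
    def arctan_series(x, terms):
        return sum((-1)**k * x**(2*k + 1) / (2*k + 1) for k in range(terms))
    # Machin's formula: pi/4 = 4*arctan(1/5) - arctan(1/239)
    return 4 * (4 * arctan_series(1.0/5, terms) - arctan_series(1.0/239, terms))

print(machin_pi())  # about ten terms already give close to full double precision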

1 Answer

5

You are invoking the Python built-in sum, rather than NumPy's vectorized method sum:

import numpy as np
import random as rd

def np_pi(n):
    x = np.random.random(n)
    y = np.random.random(n)

    # ndarray.sum() performs the reduction in compiled NumPy code,
    # instead of looping over the array at the Python level
    return (x*x + y*y <= 1).sum() * 4. / n

def dart_board():
    x, y = rd.random(), rd.random()
    return (x*x + y*y <= 1)

def pi(n):
    s = sum([dart_board() for _ in range(n)])
    return s * 4. / n

Timing results are now much different:

In [12]: %timeit np_pi(10000)
1000 loops, best of 3: 250 us per loop

In [13]: %timeit pi(10000)
100 loops, best of 3: 3.54 ms per loop

My guess is that calling the built-in sum on a NumPy array incurs overhead by iterating over the array element by element in Python, rather than using NumPy's vectorized routines.
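A quick way to see that overhead in isolation is to time the built-in sum against ndarray.sum() on the same array; this is an illustrative sketch, not output from the original answer:

import timeit
import numpy as np

a = np.random.random(10000)

# The built-in sum iterates over the array at the Python level, boxing each
# element as a Python float; a.sum() performs the reduction in compiled code.
print(timeit.timeit(lambda: sum(a), number=1000))
print(timeit.timeit(lambda: a.sum(), number=1000))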

Oliver W.
  • 13,169
  • 3
  • 37
  • 50