0

I have a function that takes two axes as an input - a time axis and an energy axis - and I'm trying to figure out if there's a way to generate the resulting 2D array through broadcasting, rather than in a for loop for one of the axes.

The function looks like this, and is also included in my code example: diffusion equation

Here's what I think is the naïve approach I've tried, with the two axes have different lengths and handling the time array in a for loop:

import numpy as np

def function(zeta, tau, alpha):
    return 1/np.sqrt(2*np.pi*alpha**2*tau)*np.exp(-zeta**2/(2*alpha**2*tau))

zeta = np.linspace(-2, 2, 100)
tau  = np.logspace(1e1, 1e9, 200)
alpha = 1e-6

res = np.zeros((len(tau), len(zeta)))
for i, t in enumerate(tau):
    res[i, :] = function(zeta, t, alpha)

But what I'd like to do is this:

res = function(zeta, tau, alpha)

Which gives the (I guess expected) error:

return 1/np.sqrt(2*np.pi*alpha**2*tau)*np.exp(-zeta**2/(2*alpha**2*tau))
ValueError: operands could not be broadcast together with shapes (100,) (200,) 

So is there a way to simultaneously broadcast the function across zeta and tau, and speed up the building of the 100x200 array res?

1 Answers1

0

First, a comment on np.logspace: I believe you may want to use tau = np.logspace(1, 9, 200) rather than tau = np.logspace(1e1, 1e9, 200). The docs say:

In linear space, the sequence starts at base ** start (base to the power of start) and ends with base ** stop

The default value of base is 10.0, so start and stop of 1 and 9 would get a sequence from 10.0 ** 1 to 10.0 ** 9 which I believe may have been your intention.

Regarding the question about 2D broadcasting, I think this may be what you're looking for:

X, Y = np.meshgrid(zeta, tau)
result = function(X, Y, alpha)

The idea comes from this answer to a similar question.

UPDATE: This should give the same result and save some space:

X, Y = np.meshgrid(zeta, tau, sparse=True)

From the docs:

sparse coordinate grids are intended to be use with Broadcasting. When all coordinates are used in an expression, broadcasting still leads to a fully-dimensonal result array.

import numpy as np
def function(zeta, tau, alpha):
    return 1/np.sqrt(2*np.pi*alpha**2*tau)*np.exp(-zeta**2/(2*alpha**2*tau))

zeta = np.linspace(-2, 2, 100)
tau  = np.logspace(1, 9, 200)
alpha = 1e-6

X, Y = np.meshgrid(zeta, tau)
result = function(X, Y, alpha)
[print(v, eval(v+'.shape')) for v in ['zeta', 'tau', 'result']]
print(f"\nresult:\n{result}")

res = np.zeros((len(tau), len(zeta)))
for i, t in enumerate(tau):
    res[i, :] = function(zeta, t, alpha)
print(f"\nres.shape {res.shape}")
print(f"\nres:\n{res}")

print(f"\nallclose()? {np.allclose(result, res)}")

Output:

zeta (100,)
tau (200,)
result (200, 100)

result:
[[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 ...
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]]

res.shape (200, 100)

res:
[[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 ...
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]]

allclose()? True

Performance: Timeit for your original approach and the one above shows:

Timeit results:
foo_1 ran in 0.001397762499982491 seconds using 1000 iterations
foo_2 ran in 0.004021247100026813 seconds using 1000 iterations

So meshgrid looks to be about 3x faster for your inputs.

Full test code is here:

import numpy as np
def function(zeta, tau, alpha):
    return 1/np.sqrt(2*np.pi*alpha**2*tau)*np.exp(-zeta**2/(2*alpha**2*tau))

zeta = np.linspace(-2, 2, 100)
tau  = np.logspace(1, 9, 200)
alpha = 1e-6

from timeit import timeit

def foo_1(zeta, tau, alpha):
    X, Y = np.meshgrid(zeta, tau)
    result = function(X, Y, alpha)
    return result
def foo_2(zeta, tau, alpha):
    res = np.zeros((len(tau), len(zeta)))
    for i, t in enumerate(tau):
        res[i, :] = function(zeta, t, alpha)
    return res

n = 1000
print(f'Timeit results:')
for foo in ['foo_' + str(i + 1) for i in range(2)]:
    t = timeit(f"{foo}(zeta, tau, alpha)", setup=f"from __main__ import zeta, tau, alpha, {foo}", number=n) / n
    print(f'{foo} ran in {t} seconds using {n} iterations')
constantstranger
  • 9,176
  • 2
  • 5
  • 19