0

I have been trying to optimize this code:

def spectra(mE, a):
    pdf = lambda E: np.exp(-(a+1)*E/mE)*(((a+1)*E/mE)**(a+1))/(scipy.special.gamma(a+1)*E)
    u=[]
    n=np.random.uniform()
    E=np.arange(0.01,150, 0.1)
    for i in E: 
        u.append(scipy.integrate.quad(pdf,0,i)[0])

    f=interp1d(u, E)
    return f(n)

I was trying to create a lookup table out of f, but it appears that every time I call the function it does the integration all over again. Is there a way to put in something like an if statement, which will let me create f once for values of mE and a and then just call that afterwards?

Thanks for the help.

Cheers.

madtowneast
  • 2,350
  • 3
  • 22
  • 31

2 Answers2

2

It sounds like what you want to do is return a known value if the function is re-called with the same (mE, a) values, and perform the calculation if the input is new.

That is called memoization. See for example What is memoization and how can I use it in Python? . Note that in modern versions of Python, you can create a decorator to apply the memoization of the function, which lets you be a little neater.

Community
  • 1
  • 1
Karl Knechtel
  • 62,466
  • 11
  • 102
  • 153
0

Most probably you'll not be able to store values of spectra(x, y) and reasonably retrieve them by exact values of floating-point x and y. You rarely encounter exact same floating point values in real life.

Note that I don't think you can cache f directly, becuse it depends on a long list of floats. Possible input space of it is so large that finding a close match seems very improbable to me.

If you cache values of spectra() you could retrieve the value for a close enough pair of arguments with a reasonable probability.

The problem is searching for such close pairs. A hash table cannot work (we need imprecise matches), an ordered list and binary search cannot work either (we have 2 dimensions). I'd use a quad tree or some other form of spatial index. You can build it dynamically and efficiently search for closest known points near your given point.

If you found cached a point really close to the one you need, you can just return the cached value. If no point is close enough, you add it to the index, in hope that it will be reused in the future. Maybe you could even interpolate if your point is between two known points close by.

The big prerequisite, of course, is that sufficient number of points in the cache has a chance to be reused. To estimate this, run your some of your calculation and store (mE, a) pairs somewhere (e.g in a file), then plot them. You'll instantly see if you have groups of points close to one another. You can look for tight clusters without plotting, too, of course. If you have enough clusters that are tight (where you could reuse one point's value for another), your cache will work. If not, don't bother implementing it.

9000
  • 39,899
  • 9
  • 66
  • 104