
Below is my task code. In this example e0 = 15, but I would like to solve this problem for a set of e0 values (e0 = 7, 10, 15, 20, 28). I have a multi-core processor and I would like to distribute the calculation for each value of e0 to a separate core.

How can I do these calculations in parallel in Python?

import sympy as sp
import scipy as sc
import numpy as np

e0=15
einf=15

# Numeric radial basis function built from the confluent hypergeometric function 1F1
def Psi(r,n):
    return 2*np.exp(-r/n)*np.sqrt(sc.special.factorial(n)/sc.special.factorial(-1+n))*sc.special.hyp1f1(1-n, 2, 2*r/n)/n**2

# The same basis function built symbolically with SymPy, so it can be differentiated
def PsiSymb(n):
    r=sp.symbols('r')
    y1=2*sp.exp(-r/n)*np.sqrt(sc.special.factorial(n)/sc.special.factorial(-1+n))/n**2
    y2 = sp.simplify(sp.functions.special.hyper.hyper([1-n], [2], 2*r/n))
    y=y1*y2
    return y

# Radial Laplacian of PsiSymb(n), lambdified back to a NumPy-callable function
def LaplacianPsi(n):
    r = sp.symbols('r')
    ydiff = 2/r*PsiSymb(n).diff(r)+PsiSymb(n).diff(r,2)
    ydiffnum = sp.lambdify(r, ydiff, "numpy")
    return ydiffnum

# Kinetic-energy matrix element between basis functions n1 and n2
def k(n1,n2):
    yint=sc.integrate.quad(lambda r: -0.5*Psi(r,n2)*LaplacianPsi(n1)(r)*r**2,0,np.inf)
    return yint[0]

# Potential-energy matrix element: Coulomb term plus two exponentially screened terms
def p(n1,n2):
    potC=sc.integrate.quad(lambda r: Psi(r,n2)*(-1/r)*Psi(r,n1)*(r**2),0,np.inf)
    potB1=sc.integrate.quad(lambda r: Psi(r,n2)*(1/einf-1/e0)*((einf/e0)**(3/5))*(-e0/(2*r))*(np.exp(-r*2.23))*Psi(r,n1)*(r**2),0,np.inf)
    potB2=sc.integrate.quad(lambda r: Psi(r,n2)*(1/einf-1/e0)*((einf/e0)**(3/5))*(-e0/(2*r))*(np.exp(-r*2.4))*Psi(r,n1)*(r**2),0,np.inf)
    pot=potC[0]+potB1[0]+potB2[0]
    return pot

def en(n1,n2):
    return k(n1,n2)+p(n1,n2)

nmax=3

# Build the nmax x nmax energy matrix
EnM = [[0]*nmax for i in range(nmax)]

for n1 in range(nmax):
    for n2 in range(nmax):
        EnM[n2][n1]=en(n1+1,n2+1)

EnEig=sc.linalg.eigvalsh(EnM)

EnB=min(EnEig)
print(EnB)
Mam Mam

1 Answer


There is no need to use multiple cores for this computation. Indeed, the bottleneck is the LaplacianPsi function, which recomputes the same thing over and over. You can use memoization to fix this. Here is an example:

import functools

@functools.cache  # cache the lambdified Laplacian so the symbolic work is done only once per n
def LaplacianPsi(n):
    r = sp.symbols('r')
    ydiff = 2/r*PsiSymb(n).diff(r)+PsiSymb(n).diff(r,2)
    ydiffnum = sp.lambdify(r, ydiff, "numpy")
    return ydiffnum

# The rest is the same

The code can be further optimized: sc.special.factorial(n) / sc.special.factorial(-1+n) is actually just n, and np.sqrt is inefficient on scalars, so it should be replaced with math.sqrt(n). This results in a code taking only 0.057 seconds as opposed to 16.5 seconds for the initial implementation on my machine, i.e. the new implementation is about 290 times faster while producing the same result!
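
As an illustration, here is a minimal sketch of that simplification applied to the numeric Psi (the same factorial ratio can be simplified in PsiSymb as well; np and sc are the imports already present in the question):

import math

def Psi(r, n):
    # factorial(n) / factorial(n - 1) == n, and math.sqrt is cheaper
    # than np.sqrt for a scalar argument
    return 2*np.exp(-r/n)*math.sqrt(n)*sc.special.hyp1f1(1-n, 2, 2*r/n)/n**2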

Directly using many cores would just waste more resources for a slower result. You can still try to use more cores with the faster implementation provided above, though it might not be significantly faster.
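
If you still want to spread the e0 values over separate cores, here is a minimal sketch using multiprocessing.Pool. It assumes the computation is wrapped in a function solve(e0) and that en has been refactored to take e0 as an explicit argument instead of reading a global (both are assumptions, not part of the original code); the if __name__ == '__main__' guard is required when processes are started with the spawn method:

from multiprocessing import Pool

E0_VALUES = [7, 10, 15, 20, 28]

def solve(e0):
    # Hypothetical wrapper: rebuild the energy matrix for this e0,
    # assuming en(n1, n2, e0) takes the parameter explicitly.
    EnM = [[en(n1 + 1, n2 + 1, e0) for n1 in range(nmax)]
           for n2 in range(nmax)]
    return min(sc.linalg.eigvalsh(EnM))

if __name__ == '__main__':  # required with the spawn start method (Windows/macOS)
    with Pool(processes=len(E0_VALUES)) as pool:
        results = pool.map(solve, E0_VALUES)
    for e0, EnB in zip(E0_VALUES, results):
        print(e0, EnB)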

Jérôme Richard
  • Thanks for the optimizations! When I use parallelization, as in the @Andrea Ianni ௫ answer, I get an error. Could you tell me how to correct it? – Mam Mam Feb 03 '23 at 17:39
  • RuntimeError: An attempt has been made to start a new process before the current process has finished its bootstrapping phase. This probably means that you are not using fork to start your child processes and you have forgotten to use the proper idiom in the main module: if __name__ == '__main__': freeze_support() ... The "freeze_support()" line can be omitted if the program is not going to be frozen to produce an executable. – Mam Mam Feb 03 '23 at 17:39
  • Did you try to read and apply what the error says? Adding `if __name__ == '__main__'` solves the issue on my machine (and causes others). Anyway, the problem is not related to this answer but to the other one. My advice is actually not to use multiprocessing unless you plan to compute something much more intensive, but there are certainly other optimizations to apply. Multiprocessing is fragile, not robust nor flexible; I personally try not to use it as much as possible. – Jérôme Richard Feb 03 '23 at 18:08
  • Instead of multiprocessing, should I then use a loop that runs through each of the e0 parameters? – Mam Mam Feb 03 '23 at 18:14
  • Yes, it should work. – Jérôme Richard Feb 03 '23 at 18:19
  • But if there are several thousand parameter values, it will take very long – Mam Mam Feb 03 '23 at 18:50
  • Well, it should take "only" a few minutes. On another personal machine, this takes 0.020 s for the example value when executed multiple times (because the cache is filled only once). I succeeded in running the code using multiprocessing but it was 3 times *slower* on my 6-core processor in the end... `functools` is not great with multiprocessing since the LRU cache is replicated for each process, not to mention that the long time to import the modules is also replicated. There seem to be other huge overheads when using multiprocessing in this case. Python is not great for parallelism. – Jérôme Richard Feb 03 '23 at 20:14
  • The current code can still be optimized, by the way. For example, `Psi(r,n)` is inefficient because it is written in pure Python and the CPython interpreter is very slow. You can use Numba (a JIT compiler) to speed it up. The `sc.special.hyp1f1` function needs to be reimplemented in Numba though (which is not simple AFAIK). Some values can also be precomputed, but this requires deeply changing the structure of the code. – Jérôme Richard Feb 03 '23 at 20:22
  • Thanks a lot for this helpful information! Nevertheless, I have one more question regarding this code. If I use nmax greater than 11, there are errors associated with the quad integration. To get rid of this error, do I need to change the integration method? How can I tell which integration method is suitable for a particular function? – Mam Mam Feb 03 '23 at 23:36
  • I do not know. I am not very familiar with integration methods (I have a strong computer-science background but not a strong mathematical one). Feel free to accept the answer if it helped and post a new question specific to this last problem. – Jérôme Richard Feb 05 '23 at 22:21