I have a Python script that reads certain arrays from a text file, does a lot of processing, and returns a 2D array, Intensity, which is plotted using imshow. The code is too slow because of a call to np.where inside a nested loop.
I have tried hard to use the multiprocessing (and joblib) modules. In both cases the code kept running forever without raising any error. After some research (Jupyter notebook never finishes processing using multiprocessing (Python 3)) I found that multiprocessing has known problems on IPython and Windows (it gives a broken pipe error: Broken pipe error with multiprocessing.Queue, and https://github.com/spyder-ide/spyder/issues/7832). I tried the workarounds suggested there, but nothing helped.
xlin = np.linspace(-40, 40, 81)
ylin = np.linspace(-40, 40, 81)
xxlin, yylin = np.meshgrid(xlin, ylin)
Intensity = np.zeros(np.shape(xxlin))

# Xnew and Ynew are 1D arrays read from the data file.
for i in range(len(xlin) - 1):
    for j in range(len(ylin) - 1):
        k = np.where((Xnew >= xlin[i]) & (Xnew < xlin[i + 1]) &
                     (Ynew >= ylin[j]) & (Ynew < ylin[j + 1]))
        N1 = np.shape(k)[1]
        Intensity[j, i] += N1

fig = plt.figure(2, figsize=(14, 14))
plt.imshow(Intensity, origin='lower', interpolation='nearest', cmap=cm.jet)
I have no issues other than the speed of the code. After much toiling I realised that np.where is the real culprit. Could someone suggest a faster alternative suited to my case? I would also appreciate help with parallelising this nested loop.
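Update: as I understand it, the nested loop is just counting how many (Xnew, Ynew) pairs fall into each grid cell, i.e. a 2D histogram. If that reading is right, np.histogram2d should give the same counts in one vectorized pass. A sketch, using synthetic stand-ins for Xnew and Ynew (in my real code they come from the data file):

```python
import numpy as np

# Synthetic stand-ins for the arrays read from the data file.
rng = np.random.default_rng(0)
Xnew = rng.uniform(-40, 40, 10000)
Ynew = rng.uniform(-40, 40, 10000)

xlin = np.linspace(-40, 40, 81)
ylin = np.linspace(-40, 40, 81)

# One pass over all points instead of one np.where per bin.
# histogram2d returns counts indexed [x_bin, y_bin], so transpose to get
# the Intensity[j, i] (y-first) layout the loop fills.
counts, _, _ = np.histogram2d(Xnew, Ynew, bins=[xlin, ylin])

# Pad to the loop's (81, 81) shape; its last row and column stay zero.
Intensity = np.zeros((len(ylin), len(xlin)))
Intensity[:-1, :-1] = counts.T
```

One caveat: np.histogram2d closes the very last bin on the right (a point exactly at 40 is counted), while my half-open conditions would drop it; for data strictly inside the grid the results should be identical, and this removes any need to parallelise the loop.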