We generally avoid using double-for loop with numpy; it's slow and with smart indexing (array[:, i-N]...) we can do a lot in a single loop.
But for your convolution problem, it might be the easiest ~~(and only ?)~~ way to do what you want. (edit: it's not. See below @Masoud's answer).
Bpmap_data=np.array(a[row-N:row+N,column-N:column+N].ravel())
is creating a new array at each loop.
Computing median and std without creating a new array (ie using the numpy view directly) would be much faster.
In fact, it's 40 times faster (on my google colab instance)
To get a 400 times faster algo, please look at @Masoud answer that is using scipy filter for 2D-array.
import numpy as np
from astropy.stats import sigma_clipped_stats
N=10
a=np.random.random((80,40))
def f():
"""Your original code"""
BPmap=[]
for row in range(N,a.shape[0]-N):
BPmap_row=[]
for column in range(N,a.shape[1]-N):
Bpmap_data=np.array(a[row-N:row+N,column-N:column+N].ravel())
mean, median, std = sigma_clipped_stats(Bpmap_data, sigma=3,iters=5)
BPmap_Nsigma=float(a[row][column]-median)/std
BPmap_row.append(BPmap_Nsigma)
BPmap.append(BPmap_row)
return BPmap
def f2():
"""this little guy is improving a lot your work"""
BPmap=[]
for row in range(N,a.shape[0]-N):
BPmap_row=[]
for column in range(N,a.shape[1]-N):
# the next 3 lines do not need any more memory
view = a[row-N:row+N,column-N:column+N]
std_without_outliers = view[view - view.mean() < 3*view.std()].std()
median = np.median(view)
# back to your workflow
BPmap_Nsigma=float(a[row][column]-median)/std_without_outliers
BPmap_row.append(BPmap_Nsigma)
BPmap.append(BPmap_row)
return BPmap
%time _ = f()
%time _ = f2()
f() == f2()
>>>CPU times: user 39.7 s, sys: 14.2 ms, total: 39.7 s
Wall time: 39.7 s
CPU times: user 969 ms, sys: 2.99 ms, total: 972 ms
Wall time: 976 ms
True
Edit
In fact, sigma_clipped_stats(a[row-N:row+N,column-N:column+N])
does slow down the loop. I suspect sigma_clipped_stats
of creating a copy of its argument.
im taking the std after getting rid of outliers from a 3 sigma cut
I'm showing here a way to do this with pure numpy; this is really faster than the function you used before.
In the end, f() = f2() so why using this astropy
function anymore ?