Translated MATLAB code into Python, but Python is way slower

Question

I've recently started using Python, and I'm working on a project that calculates wind exposure. My MATLAB code runs very fast (it finishes in about 3 minutes), but after I translated it into Python I get the same result and it takes 3 hours to finish its job. I'd really appreciate a hand figuring out what's causing such a huge difference.

Here's my Python code. I can post my MATLAB code if anyone needs it.

from netCDF4 import Dataset, num2date
import numpy as np
import matplotlib.pyplot as plt
from scipy import interpolate
#import pylab as py
#input data
dem = Dataset('comparearea_fill.nc','r')
lon = np.array(dem.variables['lon'])
lat = np.array(dem.variables['lat'])
DEM = np.array(dem.variables['elevation'])
carea = Dataset('carea.nc','r')
u = np.array(carea.variables['u10'])
v = np.array(carea.variables['v10'])
mu = np.mean(u, axis=0)
mv = np.mean(v, axis=0)
x = np.linspace(1,21,21)
y = np.linspace(1,11,11)

newu = interpolate.interp2d(x, y, mu, kind='cubic')
newv = interpolate.interp2d(x, y, mv, kind='cubic')

spu = newu(lon,lat)
spv = newv(lon,lat)

A = np.zeros((4951,9451))
B = np.zeros((4951,9451))
for i in range(100,4850):
    for j in range(100,9350):
        for n in range(20):
            A[i,j] = (DEM[i,j]-np.max(DEM[np.floor(n*spv[i,j]).astype(int),j-np.floor(n*spu[i,j]).astype(int)]))/DEM[i,j]
            if  A[i,j] < 0:
                A[i,j] = 0
            B[i,j] = (DEM[i,j]-np.max(DEM[i-np.ceil(n*spv[i,j]).astype(int),j-np.ceil(n*spu[i,j]).astype(int)]))/DEM[i,j]
            if B[i,j] < 0:
                B[i,j] = 0

C = A+B
plt.contourf(lon,lat,C); plt.colorbar()

Here, mu and mv are the monthly averages of the u and v wind components, while spu and spv are the spline-interpolated u and v winds, resampled to match the resolution of my DEM dataset.
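For readers trying to reproduce the interpolation step: `interp2d` has been removed from recent SciPy releases, so the lines that build `newu` and `newv` may fail on a current install. Below is a minimal sketch of the same idea using `RectBivariateSpline`. The helper name `upsample_wind` is hypothetical, and it assumes the coarse wind grid covers exactly the same area as the DEM, which the question does not state.

```python
import numpy as np
from scipy.interpolate import RectBivariateSpline


def upsample_wind(coarse, out_shape):
    """Cubic-spline resample of a coarse 2-D field onto a finer regular grid.

    Hypothetical helper: assumes the coarse grid and the target grid
    span the same area, so both are described in coarse index units.
    """
    coarse = np.asarray(coarse, dtype=float)
    ny, nx = coarse.shape
    y = np.arange(ny, dtype=float)  # row coordinates of the coarse grid
    x = np.arange(nx, dtype=float)  # column coordinates of the coarse grid
    # z must have shape (len(y), len(x)); s=0 (the default) interpolates exactly.
    spline = RectBivariateSpline(y, x, coarse, kx=3, ky=3)
    new_y = np.linspace(y[0], y[-1], out_shape[0])
    new_x = np.linspace(x[0], x[-1], out_shape[1])
    # Evaluating on sorted 1-D axes returns the full (rows, cols) grid.
    return spline(new_y, new_x)


# Example with an 11 x 21 field like mu or mv, refined by a factor of 4.
coarse = np.random.default_rng(0).normal(size=(11, 21))
fine = upsample_wind(coarse, (41, 81))
```

With real data the call would look like `spu = upsample_wind(mu, DEM.shape)`, provided the same-area assumption holds.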

That's because you're using large for loops inside for loops inside for loops. If you want to write code like this, either use C++ or C extensions or (?) Numba. — Mateen Ulhaq, Jun 28 '18 at 23:52
Here's a low effort solution to try first. https://numba.pydata.org/ — Mateen Ulhaq, Jun 28 '18 at 23:54
Simply using another language will not make your code run faster; you have to write optimized code in that language. Many numpy functions are vectorized, so using all these nested for loops will hurt performance significantly. — user3483203, Jun 28 '18 at 23:57
Why do you have a loop over `n`? You're not making use of the values `n = 0..18`. — Mateen Ulhaq, Jun 28 '18 at 23:58
There are many similar posts: https://stackoverflow.com/questions/17559140/matlab-twice-as-fast-as-numpy , https://stackoverflow.com/questions/46475162/performance-matlab-vs-python , etc etc. What’s new? MATLAB has a JIT, Python (by default) doesn’t, and therefore it’s going to be slower. — Cris Luengo, Jun 29 '18 at 00:37
Hi Mateen, I'm using np.max to find the highest point among the 20 points behind the current point along the wind direction. Is there any problem with this method? — Jiang Lisong, Jun 29 '18 at 16:09
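To make the vectorization and "highest of the 20 upwind points" comments concrete, here is a rough NumPy sketch that keeps only the short loop over `n` and handles all grid cells at once with array indexing. It is one reading of the intent described in the comments, not the original MATLAB logic. The function name `wind_exposure` and its parameters are hypothetical. It also makes three assumptions that differ from the posted loop: steps run from 1 to `n_steps` (step 0 is the cell itself, which would force every ratio to be non-positive for positive elevations), upwind indices are clipped to the grid instead of wrapping around, and cells with zero elevation get an exposure of 0.

```python
import numpy as np


def wind_exposure(dem, spu, spv, n_steps=20, margin=100):
    """Sum of floor- and ceil-based exposure ratios for interior cells."""
    dem = np.asarray(dem, dtype=float)
    rows, cols = dem.shape
    inner = (slice(margin, rows - margin), slice(margin, cols - margin))
    # Row and column index of every interior cell.
    ii, jj = np.meshgrid(np.arange(margin, rows - margin),
                         np.arange(margin, cols - margin), indexing="ij")
    centre = dem[inner]
    u = np.asarray(spu, dtype=float)[inner]
    v = np.asarray(spv, dtype=float)[inner]

    total = np.zeros(centre.shape)
    for rounding in (np.floor, np.ceil):        # the A and B passes
        upwind_max = np.full(centre.shape, -np.inf)
        for n in range(1, n_steps + 1):         # assumption: skip n = 0
            r = np.clip(ii - rounding(n * v).astype(int), 0, rows - 1)
            c = np.clip(jj - rounding(n * u).astype(int), 0, cols - 1)
            # Running maximum of the elevation seen so far along the wind.
            np.maximum(upwind_max, dem[r, c], out=upwind_max)
        ratio = np.zeros(centre.shape)
        np.divide(centre - upwind_max, centre, out=ratio, where=centre != 0)
        total += np.maximum(ratio, 0.0)         # negative exposure becomes 0

    out = np.zeros(dem.shape)
    out[inner] = total
    return out
```

On the full 4951 x 9451 grid each temporary array takes a few hundred megabytes, so processing the rows in blocks may be necessary. Numba, as suggested above, is the other route if the triple loop is kept as written.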