I have a large matrix (shape: (2e6, 6)) containing geophysical data. Inside three nested for loops, I search the matrix for the matching row and assign my variables from it.
My first solution uses np.where. It's way too slow! I read that it might be better to use another for loop to improve performance. However, the code I came up with is even slightly slower.
Does anyone have an idea how to improve the performance, please?
First Solution (np.where)
    for lat in LATS:
        for lon in LONS:
            for depth in range(1, 401, 1):
                node_point_line = matrix[np.where((matrix[:,0]==lat) * (matrix[:,1]==lon) * (matrix[:,2]==depth))][0]
                var1 = node_point_line[3]
                var2 = node_point_line[4]
                var3 = node_point_line[5]
                ...
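To put a number on "too slow", a single lookup can be timed in isolation. The following is only a rough sketch: the random data is a placeholder for my real matrix (and is smaller), and the function name `lookup_where` is made up for illustration. It uses `&` instead of `*` to combine the boolean masks, which should be equivalent here.

```python
import time
import numpy as np

def lookup_where(matrix, lat, lon, depth):
    """Same full-matrix scan as in the loop above, for one (lat, lon, depth)."""
    mask = (matrix[:, 0] == lat) & (matrix[:, 1] == lon) & (matrix[:, 2] == depth)
    return matrix[mask][0]

# Synthetic placeholder data; the real matrix has about 2e6 rows.
n_rows = 200_000
rng = np.random.default_rng(0)
matrix = rng.random((n_rows, 6))
matrix[:, 2] = rng.integers(1, 401, size=n_rows)  # depths 1..400

# Pick an existing row so the lookup is guaranteed to find a match.
lat, lon, depth = matrix[n_rows // 2, :3]

start = time.perf_counter()
row = lookup_where(matrix, lat, lon, depth)
elapsed = time.perf_counter() - start

print(f"one lookup: {elapsed * 1e3:.2f} ms")
# The triple loop repeats this scan len(LATS) * len(LONS) * 400 times.
```

Multiplying the single-lookup time by the number of loop iterations gives a rough estimate of the total run time.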
Second Solution (extra for loop)
    for lat in LATS:
        for lon in LONS:
            for depth in range(1, 401, 1):
                matrix_flat = matrix.flatten()
                # Step by 6 so that i always points at the start of a row;
                # otherwise i+5 can run past the end or match across rows.
                for i in range(0, len(matrix_flat), 6):
                    if matrix_flat[i]==lat and matrix_flat[i+1]==lon and matrix_flat[i+2]==depth:
                        var1 = matrix_flat[i+3]
                        var2 = matrix_flat[i+4]
                        var3 = matrix_flat[i+5]
                        ...
Again, both solutions are too slow. I would like to avoid Fortran or C++ (I know they are faster). Any suggestions?
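One direction that might be worth trying is to stop scanning the whole matrix on every iteration and build a lookup table once instead. The following is a minimal sketch of that idea, not a benchmarked solution. It assumes the columns are (lat, lon, depth, var1, var2, var3), that each (lat, lon, depth) combination occurs at most once, and that exact equality on the coordinates is acceptable, as in the `==` tests above. The small random `matrix`, the values in `LATS` and `LONS`, and the helper `build_index` are made up for illustration.

```python
import numpy as np

def build_index(matrix):
    """Map (lat, lon, depth) -> row number, built in a single pass."""
    # Exact float equality is assumed, same as the == tests in the loops.
    keys = zip(matrix[:, 0].tolist(), matrix[:, 1].tolist(), matrix[:, 2].tolist())
    return {key: row for row, key in enumerate(keys)}

# Hypothetical stand-in data: 2 lats x 3 lons x 4 depths, 3 variables each.
LATS = [10.0, 10.5]
LONS = [20.0, 20.5, 21.0]
DEPTHS = range(1, 5)
grid = [(lat, lon, d) for lat in LATS for lon in LONS for d in DEPTHS]
rng = np.random.default_rng(0)
matrix = np.hstack([np.array(grid, dtype=float), rng.random((len(grid), 3))])
rng.shuffle(matrix)  # row order should not matter for the lookup

index = build_index(matrix)  # built once, before the loops

for lat in LATS:
    for lon in LONS:
        for depth in DEPTHS:
            row = matrix[index[(lat, lon, depth)]]  # dict lookup, no scan
            var1, var2, var3 = row[3], row[4], row[5]
```

The dictionary costs one pass over the matrix up front, after which each lookup no longer depends on the number of rows. If a (lat, lon, depth) combination can be missing, `index.get(...)` with an explicit check would be needed instead of `index[...]`.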