I've got a question. I've used these SO threads this, this and that to get where I'm now.
I've got a DEM file and coordinates + data from weather-stations. Now, I would like to interpolate the air temperature data following the GIDS model (Model 12 in this article) using my DEM. For selection of stations I want to use the 8 nearest neighbors using KDTree.
In short, (I think) I want to evaluate a function at every cell using the coordinates and elevation of my DEM.
I've developed a working function which uses x,y as input to evaluate each value of my grid. See details my IPython Notebook.
But now for a whole numpy array. I somehow understand that I have to vectorize my function so that I can apply it on a Numpy array instead of using a double loop. See my simplified code to evaluate my function on the array using both for-loop and a trial with a vectorized function using a numpy meshgrid. Is this the way forward?
>>> data = [[0.8,0.7,5,25],[2.1,0.71,6,35],[0.75,2.2,8,20],[2.2,2.1,4,18]]
>>> columns = ['Long', 'Lat', 'H', 'T']
>>> df = pd.DataFrame(data, columns=columns)
>>> tree = KDTree(zip(df.ix[:,0],df.ix[:,1]), leafsize=10)
>>> dem = np.array([[5,7,6],[7,9,7],[8,7,4]])
>>> print 'Ground points\n', df
Ground points
Long Lat H T
0 0.80 0.70 5 25
1 2.10 0.71 6 35
2 0.75 2.20 8 20
3 2.20 2.10 4 18
>>> print 'Grid to evaluate\n', dem
Grid to evaluate
[[5 7 6]
[7 9 7]
[8 7 4]]
>>> def f(x,y):
... [see IPython Notebook for details]
... return m( sum((p((d(1,di[:,0])),2)))**-1 ,
... sum(m(tp+(m(b1,(s(pix.ix[0,0],longp))) + m(b2,(s(pix.ix[0,1],latp))) + m(b3,(s(pix.ix[0,2],hp)))), (p((d(1,di[:,0])),2)))) )
...
>>> #Double for-loop
...
>>> tp = np.zeros([dem.shape[0],dem.shape[1]])
>>> for x in range(dem.shape[0]):
... for y in range(dem.shape[1]):
... tp[x][y] = f(x,y)
...
>>> print 'T predicted\n', tp
T predicted
[[ 24.0015287 18.54595636 19.60427132]
[ 28.90354881 20.72871172 17.35098489]
[ 54.69499782 43.79200925 15.33702417]]
>>> # Evaluation of vectorized function using meshgrid
...
>>> x = np.arange(0,3,1)
>>> y = np.arange(0,3,1)
>>> xx, yy = np.meshgrid(x,y, sparse=True)
>>> f_vec = np.vectorize(f) # vectorization of function f
>>> tp_vec = f_vec(xx,yy).T
>>> print 'meshgrid\nx\n', xx,'\ny\n',yy
meshgrid
x
[[0 1 2]]
y
[[0]
[1]
[2]]
>>> print 'T predicted using vectorized function\n', tp_vec
T predicted using vectorized function
[[ 24.0015287 18.54595636 19.60427132]
[ 28.90354881 20.72871172 17.35098489]
[ 54.69499782 43.79200925 15.33702417]]
EDIT
I used %%timeit
to check on real data with grid with size of 100,100 and results were as follow:
#double loop
for x in range(100):
for y in range(100):
tp[x][y] = f(x,y)
1 loops, best of 3: 29.6 s per loop
#vectorized
tp_vec = f_vec(xx,yy).T
1 loops, best of 3: 29.5 s per loop
Both not so great..