3

I've got a question. I've used these SO threads this, this and that to get where I'm now.

I've got a DEM file and coordinates + data from weather-stations. Now, I would like to interpolate the air temperature data following the GIDS model (Model 12 in this article) using my DEM. For selection of stations I want to use the 8 nearest neighbors using KDTree.

In short, (I think) I want to evaluate a function at every cell using the coordinates and elevation of my DEM.

I've developed a working function which uses x,y as input to evaluate each value of my grid. See details my IPython Notebook.

But now for a whole numpy array. I somehow understand that I have to vectorize my function so that I can apply it on a Numpy array instead of using a double loop. See my simplified code to evaluate my function on the array using both for-loop and a trial with a vectorized function using a numpy meshgrid. Is this the way forward?

>>> data = [[0.8,0.7,5,25],[2.1,0.71,6,35],[0.75,2.2,8,20],[2.2,2.1,4,18]]
>>> columns = ['Long', 'Lat', 'H', 'T']
>>> df = pd.DataFrame(data, columns=columns)
>>> tree = KDTree(zip(df.ix[:,0],df.ix[:,1]), leafsize=10)
>>> dem = np.array([[5,7,6],[7,9,7],[8,7,4]])
>>> print 'Ground points\n', df
Ground points
   Long   Lat  H   T
0  0.80  0.70  5  25
1  2.10  0.71  6  35
2  0.75  2.20  8  20
3  2.20  2.10  4  18
>>> print 'Grid to evaluate\n', dem
Grid to evaluate
[[5 7 6]
 [7 9 7]
 [8 7 4]]
>>> def f(x,y):
...     [see IPython Notebook for details]
...     return m( sum((p((d(1,di[:,0])),2)))**-1 ,
...            sum(m(tp+(m(b1,(s(pix.ix[0,0],longp))) + m(b2,(s(pix.ix[0,1],latp))) + m(b3,(s(pix.ix[0,2],hp)))), (p((d(1,di[:,0])),2)))) )
... 
>>> #Double for-loop
... 
>>> tp = np.zeros([dem.shape[0],dem.shape[1]])
>>> for x in range(dem.shape[0]):
...     for y in range(dem.shape[1]):
...         tp[x][y] = f(x,y)
... 
>>> print 'T predicted\n', tp
T predicted
[[ 24.0015287   18.54595636  19.60427132]
 [ 28.90354881  20.72871172  17.35098489]
 [ 54.69499782  43.79200925  15.33702417]]
>>> # Evaluation of vectorized function using meshgrid
... 
>>> x = np.arange(0,3,1)
>>> y = np.arange(0,3,1)
>>> xx, yy = np.meshgrid(x,y, sparse=True)
>>> f_vec = np.vectorize(f) # vectorization of function f
>>> tp_vec = f_vec(xx,yy).T
>>> print 'meshgrid\nx\n', xx,'\ny\n',yy
meshgrid
x
[[0 1 2]] 
y
[[0]
 [1]
 [2]]
>>> print 'T predicted using vectorized function\n', tp_vec
T predicted using vectorized function
[[ 24.0015287   18.54595636  19.60427132]
 [ 28.90354881  20.72871172  17.35098489]
 [ 54.69499782  43.79200925  15.33702417]]

EDIT

I used %%timeit to check on real data with grid with size of 100,100 and results were as follow:

#double loop
for x in range(100):
    for y in range(100):        
        tp[x][y] = f(x,y)
1 loops, best of 3: 29.6 s per loop

#vectorized
tp_vec = f_vec(xx,yy).T
1 loops, best of 3: 29.5 s per loop

Both not so great..

Community
  • 1
  • 1
Mattijn
  • 12,975
  • 15
  • 45
  • 68
  • 1
    Though I appreciate you've been quite careful in presenting the background and the references and so on, your question is both very specific to you and rather time consuming to answer (which minimises one's motivation to help). Can I suggest you write a minimal example of your difficulty, with embedded test data. – Henry Gomersall Sep 04 '13 at 11:55
  • I've updated the code for a simplified general version. – Mattijn Sep 05 '13 at 04:32
  • Now make it a [minimal working example](http://minimalbeispiel.de/mini-en.html) - create a single file with everything in that runs and demonstrates the problem, and post that. Strip out *everything* that is not relevant. While you're at it, try to adhere a little more closely to [pep8](http://www.python.org/dev/peps/pep-0008/); currently your code is hard to parse (most importantly, your function `f` is difficult to read - make it multiple lines and use much clearer variable names). – Henry Gomersall Sep 05 '13 at 06:55
  • @Henry, thanks for your input, it's really appreciated. I mean it. My question was how to apply a vectorized function on each cell. So I've add the answer. Now I'm struggling to optimize my function, but that's not related anymore to this question. Anywhay, thanks again. – Mattijn Sep 06 '13 at 09:45
  • Oh, I thought that was the difficulty all along! In my experience, people rarely find `np.vectorize` does quite what they want it to do. – Henry Gomersall Sep 06 '13 at 10:42

1 Answers1

3

If using vectorized function for a grid, try building a meshgrid with the shape of the dependent array. Use the components derived from the meshgrid to evaluate each grid cell using the vectorized function. Something like this

def f(x,y):
    '...some code...'
    single_value = array[x,y] # = dependent array (e.g. DEM)
    '...some code...'
    return z

x = np.arange(array.shape[0])
y = np.arange(array.shape[1])
xx, yy = np.meshgrid(x,y, sparse=True)

f_vec = np.vectorize(f) # vectorization of function f

tp_vec = f_vec(xx,yy).T
Mattijn
  • 12,975
  • 15
  • 45
  • 68