1

The griding the data (d) in irregular grid (x and y) using Scipy's griddata is timecomsuing when the datasets are many. But, the longitudes and latitudes (x and y) are always same, only the data (d) are changing. In this case, once using the giddata, how to repeat the procedure with different d arrys to achieve faster result?

import numpy as np, matplotlib.pyplot as plt
from scipy.interpolate import griddata

x = np.array([110, 112, 114, 115, 119, 120, 122, 124]).astype(float)

y = np.array([60, 61, 63, 67, 68, 70, 75, 81]).astype(float)

d = np.array([4, 6, 5, 3, 2, 1, 7, 9]).astype(float)

ulx, lrx = np.min(x), np.max(x)
uly, lry = np.max(y), np.min(y)
xi = np.linspace(ulx, lrx, 15)
yi = np.linspace(uly, lry, 15)
grided_data = griddata((x, y), d, (xi.reshape(1,-1), yi.reshape(-1,1)), method='nearest',fill_value=0)
plt.imshow(grided_data)
plt.show()

The above code works for one array of d. But I have hundreds of other arrays.

Eric Bal
  • 1,115
  • 3
  • 12
  • 16

1 Answers1

2

griddata with nearest ends up using NearestNDInterpolator. That's a class that creates an iterator, which is called with the xi:

elif method == 'nearest':
    ip = NearestNDInterpolator(points, values, rescale=rescale)
    return ip(xi)

So you could create your own NearestNDInterpolator and call it with multiple times with different xi.

But I think in your case you want to change the values. Looking at the code for that class I see

    self.tree = cKDTree(self.points)
    self.values = y

the __call__ does:

    dist, i = self.tree.query(xi)
    return self.values[i]

I don't know the relative cost of creating the tree versus query.

So it should be easy to change values between uses of __call__. And it looks like values could have multiple columns, since it's just indexing on the 1st dimension.

This interpolator is simple enough that you could write your own using the same tree idea.


Here's a Nearest Interpolator that lets you repeat the interpolation for the same points, but different z values. I haven't done timings yet to see how much time it saves

class MyNearest(interpolate.NearestNDInterpolator):
    # normal interpolation, but returns the near neighbor indices as well
    def __call__(self, *args):
        xi = interpolate.interpnd._ndim_coords_from_arrays(args, ndim=self.points.shape[1])
        xi = self._check_call_shape(xi)
        xi = self._scale_x(xi)
        dist, i = self.tree.query(xi)
        return i, self.values[i]

def my_griddata(points, values, method='linear', fill_value=np.nan,
             rescale=False):
    points = interpolate.interpnd._ndim_coords_from_arrays(points)

    if points.ndim < 2:
        ndim = points.ndim
    else:
        ndim = points.shape[-1]
    assert(ndim==2)
    # simplified call for 2d 'nearest'
    ip = MyNearest(points, values, rescale=rescale)
    return ip # ip(xi)  # return iterator, not values

ip = my_griddata((xreg, yreg), z,  method='nearest',fill_value=0)
print(ip)
xi = (xi.reshape(1,-1), yi.reshape(-1,1))
I, data = ip(xi)
print(data.shape)
print(I.shape)
print(np.allclose(z[I],data))
z1 = xreg+yreg  # new z data
data = z1[I]  # should show diagonal color bars

So as long as z has the same shape as before (and as xreg), z[I] will return the nearest value for each xi.

And it can interpolated 2d data as well (e.g. (225,n) shaped)

z1 = np.array([xreg+yreg, xreg-yreg]).T
print(z1.shape)   # (225,2)
data = z1[I]
print(data.shape)  # (20,20,2)
hpaulj
  • 221,503
  • 14
  • 230
  • 353
  • Your idea may be better but I want to create the space first using griddata and fill the space based on available data(d).Am I wrong? – Eric Bal May 01 '15 at 20:50
  • I roughed out an interpolator that lets you get nearest values for any `z` of the correct shape just by indexing. – hpaulj May 01 '15 at 23:38