29

So, I have three numpy arrays which store latitude, longitude, and some property value on a grid -- that is, I have LAT(y,x), LON(y,x), and, say temperature T(y,x), for some limits of x and y. The grid isn't necessarily regular -- in fact, it's tripolar.

I then want to interpolate these property (temperature) values onto a bunch of different lat/lon points (stored as lat1(t), lon1(t), for about 10,000 t...) which do not fall on the actual grid points. I've tried matplotlib.mlab.griddata, but that takes far too long (it's not really designed for what I'm doing, after all). I've also tried scipy.interpolate.interp2d, but I get a MemoryError (my grids are about 400x400).

Is there any sort of slick, preferably fast way of doing this? I can't help but think the answer is something obvious... Thanks!!

user391045
  • 353
  • 2
  • 4
  • 6
  • The 'irregular grid' in the title threw me off a bit. You have a sample of points that happens to be distributed across space, but you do not have the structure of the grid as in http://matplotlib.org/examples/pylab_examples/tripcolor_demo.html Your data are scattered points across a field that you can assume is somewhat smooth. Interpolation over an irregular or unstructured grid or mesh that can respect discontinuities in the field can be done with matplotlib.tri http://matplotlib.org/api/tri_api.html . – Dave X Jan 06 '17 at 16:31

6 Answers6

12

Try the combination of inverse-distance weighting and scipy.spatial.KDTree described in SO inverse-distance-weighted-idw-interpolation-with-python. Kd-trees work nicely in 2d 3d ..., inverse-distance weighting is smooth and local, and the k= number of nearest neighbours can be varied to tradeoff speed / accuracy.

Community
  • 1
  • 1
denis
  • 21,378
  • 10
  • 65
  • 88
  • You, my friend, are a genius. That KDTree class is brilliant! Exactly what I needed... – user391045 Jul 14 '10 at 19:45
  • 1
    I had some troubles with using vanilla inverse weighting. Found that it had some serious artifacting when the sample point was outside a cluster of points. I over came this by fitting a linear approximation (instead of a constant approximation) to the weighted data for the N-nearest neighbours. This produced pretty good results with the same amount of searching, just the overhead of solving a NxN linear system. – Michael Anderson Jul 14 '10 at 23:16
  • @Michael, is your data 2d, how scattered, what's Nnear ? Could you give an example of distances and values that misbehave ? E.g distances 1 1 1 1 1 10, values 1 1 1 1 1 10 => interpolate (6 / 5.1) = 1.18. Also, NxN ? In 2d, fitting a plane ax + by + c to N points (with weights say 1/dist) is either numpy.linalg .lstsq Nx3 or .solve 3x3 . – denis Jul 15 '10 at 08:31
  • My data was 3D, but the problem occurs even in 1D. Take N=3 with linear data (1,1) (2,2),(3,3), sample at 2.5 and you get a result of about 2.3 (an under estimate by 10%). Things are worse if we estimate at 3.5, producing a value near 2.5 instead of the "real" 3.5 . Some will say we're now doing extrapolation rather than interpolation, but if our data points are at 1,2,3,10 then 1,2,3 are still the three closest points to 3.5 .. and we'd end up with the same result. This is what I mean by values outside of clusters. Fitting a line gives the "right" result - at least for my data – Michael Anderson Jul 15 '10 at 08:49
9

There is a nice inverse distance example by Roger Veciana i Rovira along with some code using GDAL to write to geotiff if you're into that.

This is of coarse to a regular grid, but assuming you project the data first to a pixel grid with pyproj or something, all the while being careful what projection is used for your data.

A copy of his algorithm and example script:

from math import pow  
from math import sqrt  
import numpy as np  
import matplotlib.pyplot as plt  
  
def pointValue(x,y,power,smoothing,xv,yv,values):  
    nominator=0  
    denominator=0  
    for i in range(0,len(values)):  
        dist = sqrt((x-xv[i])*(x-xv[i])+(y-yv[i])*(y-yv[i])+smoothing*smoothing);  
        #If the point is really close to one of the data points, return the data point value to avoid singularities  
        if(dist<0.0000000001):  
            return values[i]  
        nominator=nominator+(values[i]/pow(dist,power))  
        denominator=denominator+(1/pow(dist,power))  
    #Return NODATA if the denominator is zero  
    if denominator > 0:  
        value = nominator/denominator  
    else:  
        value = -9999  
    return value  
  
def invDist(xv,yv,values,xsize=100,ysize=100,power=2,smoothing=0):  
    valuesGrid = np.zeros((ysize,xsize))  
    for x in range(0,xsize):  
        for y in range(0,ysize):  
            valuesGrid[y][x] = pointValue(x,y,power,smoothing,xv,yv,values)  
    return valuesGrid  
      
  
if __name__ == "__main__":  
    power=1  
    smoothing=20  
  
    #Creating some data, with each coodinate and the values stored in separated lists  
    xv = [10,60,40,70,10,50,20,70,30,60]  
    yv = [10,20,30,30,40,50,60,70,80,90]  
    values = [1,2,2,3,4,6,7,7,8,10]  
      
    #Creating the output grid (100x100, in the example)  
    ti = np.linspace(0, 100, 100)  
    XI, YI = np.meshgrid(ti, ti)  
  
    #Creating the interpolation function and populating the output matrix value  
    ZI = invDist(xv,yv,values,100,100,power,smoothing)  
  
  
    # Plotting the result  
    n = plt.normalize(0.0, 100.0)  
    plt.subplot(1, 1, 1)  
    plt.pcolor(XI, YI, ZI)  
    plt.scatter(xv, yv, 100, values)  
    plt.title('Inv dist interpolation - power: ' + str(power) + ' smoothing: ' + str(smoothing))  
    plt.xlim(0, 100)  
    plt.ylim(0, 100)  
    plt.colorbar()  
  
    plt.show() 
ryanjdillon
  • 17,658
  • 9
  • 85
  • 110
1

There's a bunch of options here, which one is best will depend on your data... However I don't know of an out-of-the-box solution for you

You say your input data is from tripolar data. There are three main cases for how this data could be structured.

  1. Sampled from a 3d grid in tripolar space, projected back to 2d LAT, LON data.
  2. Sampled from a 2d grid in tripolar space, projected into 2d LAT LON data.
  3. Unstructured data in tripolar space projected into 2d LAT LON data

The easiest of these is 2. Instead of interpolating in LAT LON space, "just" transform your point back into the source space and interpolate there.

Another option that works for 1 and 2 is to search for the cells that maps from tripolar space to cover your sample point. (You can use a BSP or grid type structure to speed up this search) Pick one of the cells, and interpolate inside it.

Finally there's a heap of unstructured interpolation options .. but they tend to be slow. A personal favourite of mine is to use a linear interpolation of the nearest N points, finding those N points can again be done with gridding or a BSP. Another good option is to Delauney triangulate the unstructured points and interpolate on the resulting triangular mesh.

Personally if my mesh was case 1, I'd use an unstructured strategy as I'd be worried about having to handle searching through cells with overlapping projections. Choosing the "right" cell would be difficult.

Michael Anderson
  • 70,661
  • 7
  • 134
  • 187
  • +1: ..for the mention of BSP trees and generally putting my what I was getting at more elequently than I managed :-) You could form the BSP by centering each BSP node on one of the new data points and then simply drilling down to find all the neighbouring points. – Jon Cage Jul 14 '10 at 00:34
  • Nice! The consensus seems to be that I'm going to have to work at this a bit, but that's okay. I do like your suggestion of a BSP technique... Thanks a lot! – user391045 Jul 14 '10 at 05:36
  • One part of case 3 might be that you have a data defined on an unstructured grid where a generated Delauney convex hull might not be appropriate. E.g. http://matplotlib.org/examples/pylab_examples/tripcolor_demo.html Then interpolating on the given triangular mesh might be good: http://matplotlib.org/api/tri_api.html – Dave X Jan 05 '17 at 15:44
0

There is a FORTRAN library called BIVAR, which is very suitable for this problem. With a few modifications you can make it usable in python using f2py.

From the description:

BIVAR is a FORTRAN90 library which interpolates scattered bivariate data, by Hiroshi Akima.

BIVAR accepts a set of (X,Y) data points scattered in 2D, with associated Z data values, and is able to construct a smooth interpolation function Z(X,Y), which agrees with the given data, and can be evaluated at other points in the plane.

Community
  • 1
  • 1
ye-ti-800
  • 218
  • 1
  • 9
0

I suggest you taking a look at GRASS (an open source GIS package) interpolation features (http://grass.ibiblio.org/gdp/html_grass62/v.surf.bspline.html). It's not in python but you can reimplement it or interface with C code.

Vitor Py
  • 5,145
  • 4
  • 39
  • 62
0

Am I right in thinking your data grids look something like this (red is the old data, blue is the new interpolated data)?

alt text http://www.geekops.co.uk/photos/0000-00-02%20%28Forum%20images%29/DataSeparation.png

This might be a slightly brute-force-ish approach, but what about rendering your existing data as a bitmap (opengl will do simple interpolation of colours for you with the right options configured and you could render the data as triangles which should be fairly fast). You could then sample pixels at the locations of the new points.

Alternatively, you could sort your first set of points spatially and then find the closest old points surrounding your new point and interpolate based on the distances to those points.

Jon Cage
  • 36,366
  • 38
  • 137
  • 215
  • Right idea with the grid, though I'm actually tracking the properties of a virtual particle as it travels through the mesh, so the blue dots should look more like a trail of breadcrumbs: ![mesh](http://i276.photobucket.com/albums/kk31/account321/DataSeparation.png) Hopefully that picture works. The image rendering idea is interesting -- I do have the PIL available, so I may give it a go. Thanks! – user391045 Jul 14 '10 at 05:34