18

This is a follow-up question to my previous post: Python/Scipy Interpolation (map_coordinates)

Let's say I want to interpolate over a 2d rectangular area. My variable 'z' contains the data as shown below. Each column is at a constant value, however, each row of the array may be at a different value as shown in the comment below.

from scipy import interpolate
from numpy import array
import numpy as np
#                                               # 0.0000, 0.1750, 0.8170, 1.0000
z = array([[-2.2818,-2.2818,-0.9309,-0.9309],   # 0.0000, 0.0000, 0.0000, 0.0000
           [-2.2818,-2.2818,-0.9309,-0.9309],   # 0.2620, 0.2784, 0.3379, 0.3526
           [-1.4891,-1.4891,-0.5531,-0.5531],   # 0.6121, 0.6351, 0.7118, 0.7309
           [-1.4891,-1.4891,-0.5531,-0.5531]])  # 1.0000, 1.0000, 1.0000, 1.0000
# Rows, Columns = z.shape

cols = array([0.0000, 0.1750, 0.8170, 1.0000])
rows = array([0.0000, 0.2620, 0.6121, 1.0000])

sp = interpolate.RectBivariateSpline(rows, cols, z, kx=1, ky=1, s=0)

xi = np.array([0.00000, 0.26200, 0.27840, 0.33790, 0.35260, 0.61210, 0.63510,
               0.71180, 0.73090, 1.00000], dtype=np.float)
yi = np.array([0.000, 0.167, 0.815, 1.000], dtype=np.float)
print sp(xi, yi)

As another way of visualizing this, the array of values I KNOW would be:

rows = array([0.0000, 0.2620, 0.2784, 0.3379, 0.3526,
                      0.6121, 0.6351, 0.7118, 0.7309, 1.0000])
#          # 0.0000, 0.1750, 0.8170, 1.0000
z = array([[-2.2818,-2.2818,-0.9309,-0.9309],   # 0.0000
           [-2.2818,      ?,      ?,      ?],   # 0.2620,
           [      ?,-2.2818,      ?,      ?],   # 0.2784
           [      ?,      ?,-0.9309,      ?],   # 0.3379
           [      ?      ,?,      ?,-0.9309],   # 0.3526
           [-1.4891,      ?,      ?,      ?],   # 0.6121
           [      ?,-1.4891,      ?,      ?],   # 0.6351
           [      ?,      ?,-0.5531,      ?],   # 0.7118
           [      ?,      ?,      ?,-0.5531],   # 0.7309
           [-1.4891,-1.4891,-0.5531,-0.5531]])  # 1.0000

I do not know the '?' values, and they should be interpolated. I tried replacing them with None, but then get 'nan' for all of my results.

EDIT:

I think I need to use either 'griddata' or 'interp2'. griddata seems to produce the result I expect, but 'interp2' does not.

from scipy import interpolate
from numpy import array
import numpy as np

z = array([[-2.2818,-2.2818,-0.9309,-0.9309],
           [-2.2818,-2.2818,-0.9309,-0.9309],
           [-1.4891,-1.4891,-0.5531,-0.5531],
           [-1.4891,-1.4891,-0.5531,-0.5531]])

rows = array([0.0000, 0.0000, 0.0000, 0.0000,
              0.2620, 0.2784, 0.3379, 0.3526,
              0.6121, 0.6351, 0.7118, 0.7309,
              1.0000, 1.0000, 1.0000, 1.0000])

cols = array([0.0000, 0.1750, 0.8180, 1.0000,
              0.0000, 0.1750, 0.8180, 1.0000,
              0.0000, 0.1750, 0.8180, 1.0000,
              0.0000, 0.1750, 0.8180, 1.0000])

xi = array([0.0000, 0.2620, 0.2784, 0.3379, 0.3526, 0.6121, 0.6351, 0.7118,
               0.7309, 1.0000], dtype=np.float)
yi = array([0.000, 0.175, 0.818, 1.000], dtype=np.float)

GD = interpolate.griddata((rows, cols), z.ravel(),
                          (xi[None,:], yi[:,None]), method='linear')
I2 = interpolate.interp2d(rows, cols, z, kind='linear')

print GD.reshape(4, 10).T
print '\n'
print I2(xi, yi).reshape(4, 10).T

import matplotlib.pyplot as plt
import numpy.ma as ma

plt.figure()
GD = interpolate.griddata((rows.ravel(), cols.ravel()), z.ravel(),
                          (xi[None,:], yi[:,None]), method='linear')
CS = plt.contour(xi,yi,GD,15,linewidths=0.5,colors='k')
CS = plt.contourf(xi,yi,GD,15,cmap=plt.cm.jet)
plt.colorbar()
plt.scatter(rows,cols,marker='o',c='b',s=5)

plt.figure()
I2 = I2(xi, yi)
CS = plt.contour(xi,yi,I2,15,linewidths=0.5,colors='k')
CS = plt.contourf(xi,yi,I2,15,cmap=plt.cm.jet)
plt.colorbar()
plt.scatter(rows,cols,marker='o',c='b',s=5)
plt.show()
Community
  • 1
  • 1
Scott B
  • 2,542
  • 7
  • 30
  • 44
  • 1
    interp2 will not give the expected result on unstructured or non-rectilinear data. As it says in the documentation http://docs.scipy.org/doc/scipy/reference/interpolate.html it is for structured data. It even goes so far as to say it is for the even more restrictive regular (structured) grid. I'm unconvinced since the `__doc__` for `interp2d` has an example using a rectilinear (structured, not regular) grid. I elaborated on my answer to try to make this as clear as possible including clear definitions of structured, unstructured, and regular grids. – Paul Mar 02 '11 at 08:50

1 Answers1

19

Looks like you got it.

In your upper code example and in your previous (linked) question you have structured data. Which can be interpolated using RectBivariateSpline or interp2d. This means you have data that can be described on a grid (all points on the grid have a known value). The grid doesn't necessarily have to have all the same dx and dy. (if all dx's and dy's were equal, you'd have a Regular Grid)

Now, your current question asks what to do if not all the points are known. This is known as unstructured data. All you have are a selection of points in a field. You can't necessarily construct rectangles where all vertices have known values. For this type of data, you can use (as you have) griddata, or a flavor of BivariateSpline.

Now which to choose?

The nearest analogy to the structured RectBivariateSpline is one of the unstructured BivariateSpline classes: SmoothBivariateSpline or LSQBivariateSpline. If you want to use splines to interpolate the data, go with these. this makes your function smooth and differentiable, but you can get a surface that swings outside Z.max() or Z.min().

Since you are setting ky=1 and kx=1 and are getting what I am pretty sure is just linear interpolation on the structured data, I'd personally just switch from the RectBivariateSpline spline scheme to the interp2d structured grid interpolation scheme. I know the documentation says it is for regular grids, but the example in the __doc__ itself is only structured, not regular.

I'd be curious if you found any significant differences between the methods if you do end up switching. Welcome to SciPy.

Community
  • 1
  • 1
Paul
  • 42,322
  • 15
  • 106
  • 123
  • Is there an advantage to using interp2d instead of griddata? It seems that they basically do the same thing. I think the main difference is really just that for griddata, you supply the points you want the information about and get an array of the results. For the interp2d, I think you get back an interp class which you can then use to interpolate at whichever points. Is that essentially the only difference? – Scott B Feb 28 '11 at 23:40
  • I posted another answer that compares these two methods. – Scott B Mar 01 '11 at 21:20
  • I deleted my previous comment that was probably misleading. `interp2d` and `griddata` cannot always be used interchangeably. `interp2d` is for rectilinear data only (possibly only for regular data if you believe the documentation). `griddata` and `interp2d` should give the same result for these restrictive grids. A grid that has cells whose sides are not parallel should not be used with `interp2d`. It's bothersome that `interp2d` even allows the x and y arrays to be input having the same shape as z. – Paul Mar 02 '11 at 09:01
  • Thanks for all of the help Paul. It is clear now. I also find it disturbing that it does not raise some warning or error when feeding interp2d unstructured data. – Scott B Mar 02 '11 at 23:43
  • `interp2d` documentation is actually wrong --- it just uses BivariateSpline under the hood, and works also for scattered data to the extent that FITPACK works. However, the 2D knot fitting in Fitpack is a bit fiddly in practice and often fails... – pv. Mar 07 '11 at 19:22
  • @pv: Now I'm confused. My source says that interp2d uses fitpack.bisplrep which in turn uses FITPACK SURFIT. BivariateSpline appears to be superclass for SmoothBivariateSpline and LSQBivariateSpline, the family of which uses dfitpack.surfit_lsq and dfitpack.surfit_smth (the differences between dfitpack and fitpack I am unclear on). I guess your point is that interp2d can take unstructured data despite what the docs say, which appears true. (They all seem a variant of surfit.) Which begs the question: Why does the OP get unexpected results with interp2d? – Paul Mar 07 '11 at 20:35
  • @Paul: the example from the OP works for me. There are some minor differences in the interpolants, but that's just because interp2d is a tensor spline, and griddata a triagulation-based method. There are some data sets for which FITPACK 2d routines work badly (spit out warnings and produce bad interpolants), but this does not appear to be one. – pv. Mar 07 '11 at 21:19
  • 3
    How to deal with NaN values using the structured method, especially for large arrays? – regeirk Mar 18 '13 at 19:27