1

My aim is to interpolate meteorological data from neighboring meteorological stations into the point with exact coordinates. In SciPy docs I found information about multidimensional interpolation ( from scipy.interpolate import griddata ). But honestly, I didn't understand how to make this code sutable for my task.

I have df with coordinates of stations and values of atmospheric pressure as input (and coordinates of a place without station)

Could anyone help me with this issue, please?

(https://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html#id4) - Multivariate data interpolation (griddata)ΒΆ

Misha
  • 21
  • 1
  • 4

2 Answers2

1

From what I gather, SciPy's griddata is great when you have a number of values that you want to interpolate between on a grid (as its name suggests), but if you have only one point that you would like to obtain a value for I would recommend using SciPy's interpolate.interp2d function (see docs). In either case, you'll still set up a 2D grid to derive an interpolation function, but I think it's easier to use the latter approach to get what you want.

Here's a quick example that uses the kind of data I imagine you have.

from scipy.interpolate import interp2d
import numpy as np
import pandas as pd

# Set up your dataframe with station data (you might have lon and lat instead of x and y)
df = pd.DataFrame({'x':[1,2,1,2], 'y':[1,1,2,2], 'Pa':[10,10,20,20]})

# Create your linearly-spaced 2D grid, the higher num_pts the higher the resolution 
num_pts = 10
x_grid = np.linspace(min(df['x']), max(df['x']), num_pts)
y_grid = np.linspace(min(df['y']), max(df['y']), num_pts)

# Set up your interpolation function (I would recommend linear interpolation 
# for your problem, but you can change this if you want with the 'kind' parameter)
f = interp2d(df['x'], df['y'], df['Pa'], kind='linear')

# Let's say we want to interpolate "Pa" values at:
x_prime, y_prime = (1.5, 1.5)

# Just plug the new coords into your function
pa_new = f(x_prime, y_prime)
pa_new 

>>> array([15.])

To give you an idea of what griddata could do for you:

from scipy.interpolate import griddata

X,Y = np.meshgrid(x_grid, y_grid)

Z = griddata([(x,y) for x,y in zip(df['x'],df['y'])], df['pa'], (X, Y), method='linear')

plt.subplot(111)
plt.scatter(df['x'], df['y'], 100, 'r', edgecolor='w')
plt.imshow(Z, extent=(min(df['x']-0.1),max(df['x']+0.1),min(df['y']-0.1),max(df['y']+0.1)), origin='lower')
plt.colorbar()

griddata interpolated grid values

Z
>>> array([[10.        , 10.        , 10.        , 10.        , 10.        ,
        10.        , 10.        , 10.        , 10.        , 10.        ],
       [11.11111111, 11.11111111, 11.11111111, 11.11111111, 11.11111111,
        11.11111111, 11.11111111, 11.11111111, 11.11111111, 11.11111111],
       [12.22222222, 12.22222222, 12.22222222, 12.22222222, 12.22222222,
        12.22222222, 12.22222222, 12.22222222, 12.22222222, 12.22222222],
       [13.33333333, 13.33333333, 13.33333333, 13.33333333, 13.33333333,
        13.33333333, 13.33333333, 13.33333333, 13.33333333, 13.33333333],
       [14.44444444, 14.44444444, 14.44444444, 14.44444444, 14.44444444,
        14.44444444, 14.44444444, 14.44444444, 14.44444444, 14.44444444],
       [15.55555556, 15.55555556, 15.55555556, 15.55555556, 15.55555556,
        15.55555556, 15.55555556, 15.55555556, 15.55555556, 15.55555556],
       [16.66666667, 16.66666667, 16.66666667, 16.66666667, 16.66666667,
        16.66666667, 16.66666667, 16.66666667, 16.66666667, 16.66666667],
       [17.77777778, 17.77777778, 17.77777778, 17.77777778, 17.77777778,
        17.77777778, 17.77777778, 17.77777778, 17.77777778, 17.77777778],
       [18.88888889, 18.88888889, 18.88888889, 18.88888889, 18.88888889,
        18.88888889, 18.88888889, 18.88888889, 18.88888889, 18.88888889],
       [20.        , 20.        , 20.        , 20.        , 20.        ,
        20.        , 20.        , 20.        , 20.        , 20.        ]])

user6386471
  • 1,203
  • 1
  • 8
  • 17
  • As your task is a geospatial one, you will probably have to do some manipulation to effectively project your coordinates to the x,y plane with cartesian coordinates (a 2D grid rather than geographic lat,lon coordinates on a curved surface) to use the approach above. – user6386471 Nov 17 '20 at 22:36
  • works well with 4 points, thanks. But I have only 2 (In best case 3) `df = pd.DataFrame({'x':[51.7,51.05], 'y':[39.216,39.7], 'Pa':[998.4,993.4]}) num_pts = 10 x_grid = np.linspace(min(df['x']), max(df['x']), num_pts) y_grid = np.linspace(min(df['y']), max(df['y']), num_pts) f = interp2d(df['x'], df['y'], df['Pa'], kind='linear') x_prime, y_prime = (51.216, 39.166) pa_new = f(x_prime, y_prime) pa_new ` rise an error (TypeError: m >= (kx+1)(ky+1) must hold) And what's worse - I'll probably need an opportunity to extarpolate out of bounds of min and max values of df – Misha Nov 18 '20 at 08:08
  • Yeah that's a bit of a problem - to do 2D linear interpolation you'll need a minimum of 3 points. From the [SciPy documentation](https://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.interp2d.html): `The minimum number of data points required along the interpolation axis is (k+1)**2, with k=1 for linear, k=3 for cubic ... interpolation.` It's worth being aware that too few points will make any model a bit shady at best, particularly when extrapolating. Is it possible to get more data? – user6386471 Nov 18 '20 at 12:51
1

Found solution here:

Inverse Distance Weighted (IDW) Interpolation with Python

IDW interpolation is more than enough in my case, but @user6386471, thanks for your contribution!

def linear_rbf(x, y, z, xi, yi):
dist = distance_matrix(x,y, xi,yi)

# Mutual pariwise distances between observations
internal_dist = distance_matrix(x,y, x,y)

# Now solve for the weights such that mistfit at the observations is minimized
weights = np.linalg.solve(internal_dist, z)

# Multiply the weights for each interpolated point by the distances
zi =  np.dot(dist.T, weights)
return zi

(Using the distance_matrix function here:)

def distance_matrix(x0, y0, x1, y1):
obs = np.vstack((x0, y0)).T
interp = np.vstack((x1, y1)).T

# Make a distance matrix between pairwise observations
# Note: from <http://stackoverflow.com/questions/1871536>
# (Yay for ufuncs!)
d0 = np.subtract.outer(obs[:,0], interp[:,0])
d1 = np.subtract.outer(obs[:,1], interp[:,1])

return np.hypot(d0, d1)
Misha
  • 21
  • 1
  • 4