0

I have numpy.array data set from a simulation, but I'm missing the point at the edge (x=0.1), how can I interpolate/extrapolate the data in z to the edge? I have:

x = [ 0.  0.00667  0.02692  0.05385  0.08077]
y = [ 0.     10.     20.     30.     40.     50.]

#       0.      0.00667 0.02692 0.05385 0.08077 
z = [[ 25.     25.     25.     25.     25.   ]   # 0.
     [ 25.301  25.368  25.617  26.089  26.787]   # 10.
     [ 25.955  26.094  26.601  27.531  28.861]   # 20.
     [ 26.915  27.126  27.887  29.241  31.113]   # 30.
     [ 28.106  28.386  29.378  31.097  33.402]   # 40.
     [ 29.443  29.784  30.973  32.982  35.603]]  # 50.

I want to add a new column in z corresponding to x = 0.1 so that my new x will be

x_new = [ 0.  0.00667  0.02692  0.05385  0.08077  0.1]

#       0.      0.00667 0.02692 0.05385 0.08077 0.01
z = [[ 25.     25.     25.     25.     25.       ?   ]   # 0.
     [ 25.301  25.368  25.617  26.089  26.787    ?   ]   # 10.
     [ 25.955  26.094  26.601  27.531  28.861    ?   ]   # 20.
     [ 26.915  27.126  27.887  29.241  31.113    ?   ]   # 30.
     [ 28.106  28.386  29.378  31.097  33.402    ?   ]   # 40.
     [ 29.443  29.784  30.973  32.982  35.603    ?   ]]  # 50.

Where all '?' replaced with interpolated/extrapolated data. Thanks for any help!

  • The extrapolated value would depend on whether the relation of the progressive elements in the sequence are linear or logarithmic. If the former, you could just multiply the penultimate column by your constant, if the latter, then you'd have to look at the progression between columns and determine a suitable value. Also, this is definitely extrapolation. – Andrew Sep 14 '16 at 14:53

1 Answers1

2

Have you had a look at scipy.interpolate2d.interp2d (which uses splines)?

from scipy.interpolate import interp2d
fspline = interp2d(x,y,z) # maybe need to switch x and y around
znew = fspline([0.1], y)
z = np.c_[[z, znew] # to join arrays

EDIT:

The method that @dnalow and I are imagining is along the following lines:

import numpy as np
import matplotlib.pyplot as plt

# make some test data
def func(x, y):
    return np.sin(np.pi*x) + np.sin(np.pi*y)

xx, yy = np.mgrid[0:2:20j, 0:2:20j]
zz = func(xx[:], yy[:]).reshape(xx.shape)

fig, (ax1, ax2, ax3, ax4) = plt.subplots(1,4, figsize=(13, 3))
ax1.imshow(zz, interpolation='nearest')
ax1.set_title('Original')

# remove last column
zz[:,-1] = np.nan
ax2.imshow(zz, interpolation='nearest')
ax2.set_title('Missing data')

# compute missing column using simplest imaginable model: first order Taylor
gxx, gyy = np.gradient(zz[:, :-1])
zz[:, -1] =  zz[:, -2] + gxx[:, -1] + gyy[:,-1]
ax3.imshow(zz, interpolation='nearest')
ax3.set_title('1st order Taylor approx')

# add curvature to estimate
ggxx, _ = np.gradient(gxx)
_, ggyy = np.gradient(gyy)
zz[:, -1] = zz[:, -2] + gxx[:, -1] + gyy[:,-1] + ggxx[:,-1] + ggyy[:, -1]
ax4.imshow(zz, interpolation='nearest')
ax4.set_title('2nd order Taylor approx')

fig.tight_layout()
fig.savefig('extrapolate_2d.png')

plt.show()

enter image description here

You could improve the estimate by
(a) adding higher order derivatives (aka Taylor expansion), or
(b) computing the gradients in more directions than just x and y (and then weighting the gradients accordingly).

Also, you will get better gradients if you pre-smooth the image (and now we have a complete Sobel filter...).

Paul Brodersen
  • 11,221
  • 21
  • 38
  • I tried your suggestion but what happens is that it just takes the last value in each row and duplicates it into another column, probably because I try to go outside the original data and interp2d only works in between values? – Filip Andersson Sep 14 '16 at 15:40
  • Since this is extrapolation, interpolation cannot help. Certainly not interp2d. Furthermore, extrapolation only makes sense if a model for the data exists on whose basis one can extrapolate. If not, it is not clear how to obtain the best values. Therefore, this is more than just a numerical problem. – dnalow Sep 14 '16 at 16:01
  • Yeah, should have remembered that interp2d evaluates to nearest value at the edges (or nan). I think @dnalow is right: you need to compute some local gradients, maybe even the curvature and make an explicit model. – Paul Brodersen Sep 14 '16 at 16:17
  • In principle, if nothing is known, the nearest value is actually the most reliable one. Therefore I think the answer deserves +1. Assume some 'nice' behavior of the function, one can use a polynomial approximation to continue the data similar to a Taylor expansion: http://stackoverflow.com/questions/33964913/equivalent-of-polyfit-for-a-2d-polynomial-in-python http://stackoverflow.com/questions/7997152/python-3d-polynomial-surface-fit-order-dependent – dnalow Sep 14 '16 at 16:27
  • Thank you Paul for an excellent explanation! That is exactly what i need(, except only in one dimension for this particular case). And thank you dnalow for your interesting comments, I'll keep that in mind! Fortunately the data represents a physical phenomena with 'smooth' gradients, so I believe Paul's solution works for me in this case. – Filip Andersson Sep 14 '16 at 18:24