Interpolating is easy in pandas using `df.interpolate()`.
Is there a method in pandas that extrapolates with the same elegance? I know my extrapolation should follow a second-degree polynomial.

DeanLa
- You may have to use `scipy.interpolate.UnivariateSpline` which has an `ext` option. – askewchan Sep 17 '15 at 15:57
- Related: [Extrapolate values in Pandas DataFrame](https://stackoverflow.com/questions/22491628/extrapolate-values-in-pandas-dataframe), but a simpler case that was able to be solved by another method. – askewchan Sep 17 '15 at 16:01
- There is now an [answer](http://stackoverflow.com/a/35959909/2087463) to that question with specifics on the polynomial extrapolation. – tmthydvnprt Mar 12 '16 at 17:21
2 Answers
"With the same elegance" is a somewhat tall order, but this can be done. As far as I'm aware, you'll need to compute the extrapolated values manually. Note that these values are unlikely to be meaningful unless the data you are operating on actually obey a law of the same form as the fitted function.
For example, since you requested a second-degree polynomial fit:
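    import numpy as np

    t = df["time"]
    dat = df["data"]
    # Fit a second-degree polynomial and wrap the coefficients in a callable
    p = np.poly1d(np.polyfit(t, dat, 2))

Now `p(t)` is the value of the best-fit polynomial at time `t`, including times outside the range of the original data.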
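To make this concrete, here is a minimal sketch of filling trailing gaps with the fitted polynomial. The frame, its `time` and `data` columns, and the sample values are made up for illustration, and it assumes the fit should use only the rows where `data` is present:

```python
import numpy as np
import pandas as pd

# Hypothetical data following y = 2*t**2 - 3*t + 1, with the last three values missing
t = np.arange(10, dtype=float)
y = 2 * t**2 - 3 * t + 1
df = pd.DataFrame({"time": t, "data": y})
df.loc[7:, "data"] = np.nan

# Fit a second-degree polynomial using only the rows that have data
known = df["data"].notna().to_numpy()
coeffs = np.polyfit(df["time"].to_numpy()[known], df["data"].to_numpy()[known], 2)
p = np.poly1d(coeffs)

# Evaluate the polynomial everywhere, but only use it where data is missing
fitted = pd.Series(p(df["time"].to_numpy()), index=df.index)
extrapolated = df["data"].fillna(fitted)
```

Because `fillna` only touches the missing entries, the observed values are left exactly as they were.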

AGML
Extrapolation
See this answer for how to extrapolate the values of each column of a `DataFrame` with a 3rd-order polynomial. A different order (e.g. 2nd order) may easily be used by altering `func()`.
Snippet from the answer
    import pandas as pd
    from scipy.optimize import curve_fit

    # Function to curve fit to the data
    def func(x, a, b, c, d):
        return a * (x ** 3) + b * (x ** 2) + c * x + d

    # Initial parameter guess, just to kick off the optimization
    guess = (0.5, 0.5, 0.5, 0.5)

    # Create copy of data to remove NaNs for curve fitting
    fit_df = df.dropna()

    # Place to store function parameters for each column
    col_params = {}

    # Curve fit each column
    for col in fit_df.columns:
        # Get x & y
        x = fit_df.index.astype(float).values
        y = fit_df[col].values
        # Curve fit column and get curve parameters
        params = curve_fit(func, x, y, guess)
        # Store optimized parameters
        col_params[col] = params[0]

    # Extrapolate each column
    for col in df.columns:
        # Boolean mask of the NaNs in the column
        missing = pd.isnull(df[col])
        # Get the index values for those NaNs
        x = df.index[missing].astype(float).values
        # Extrapolate those points with the fitted function
        # (.loc avoids chained assignment, which may not write back to df)
        df.loc[missing, col] = func(x, *col_params[col])
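For the second-degree case asked about in the question, the change to `func()` might look like the sketch below. The name `quadratic`, the frame `df2`, and its columns `u` and `v` are hypothetical sample data, not part of the linked answer. It assumes a numeric index and fits each column on its own non-missing rows:

```python
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit

# Second-order replacement for func(): one fewer parameter than the cubic
def quadratic(x, a, b, c):
    return a * x**2 + b * x + c

# Hypothetical frame with exactly quadratic columns; the last two rows are missing
idx = np.arange(8)
df2 = pd.DataFrame({"u": 0.5 * idx**2 + 1.0, "v": -1.0 * idx**2 + 4.0 * idx}, index=idx)
df2.iloc[6:] = np.nan

x_all = df2.index.to_numpy(dtype=float)
filled = df2.copy()
for col in df2.columns:
    # Rows where this column has data
    known = df2[col].notna().to_numpy()
    # Fit the quadratic to the known points; p0 is just a starting guess
    params, _ = curve_fit(quadratic, x_all[known], df2[col].to_numpy()[known], p0=(0.5, 0.5, 0.5))
    # Fill only the missing rows with the fitted curve
    filled.loc[~known, col] = quadratic(x_all[~known], *params)
```

Writing into a copy (`filled`) keeps the original frame available for comparison. Assigning back into `df2` would work the same way.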


tmthydvnprt