23

How to fit a locally weighted regression in Python so that it can be used to predict on new data?

There is statsmodels.nonparametric.smoothers_lowess.lowess, but it returns the estimates only for the original data set, so it seems to do fit and predict together rather than separately, as I expected.

scikit-learn always has a fit method that allows the object to be used later on new data with predict; but it doesn't implement lowess.
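
For reference, a minimal call showing the behaviour described above: statsmodels' lowess returns the smoothed values at the training x only, with no separate predict step (the toy data here is just for illustration).

import numpy as np
import statsmodels.api as sm

x = np.linspace(0, 10, 50)
y = np.sin(x) + np.random.normal(scale=0.2, size=len(x))

# returns an (n, 2) array: the sorted x values and the fitted y at those x,
# i.e. fit and "predict" happen together on the training data
smoothed = sm.nonparametric.lowess(y, x, frac=0.3)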

max
  • @JesseBakker It can certainly be used for prediction. https://stat.ethz.ch/R-manual/R-devel/library/stats/html/predict.loess.html. Also see http://stackoverflow.com/questions/12822069/loess-predict-with-new-x-values. – max Mar 27 '16 at 21:02
  • @JesseBakker, lowess is for fitting curves (as opposed to lines) using locally weighted least squares, which can help reveal trends. While not very commonly used for prediction, it certainly can be used that way. – Sarah Sep 20 '19 at 15:42
  • A [pull request](https://github.com/statsmodels/statsmodels/pull/6554#pullrequestreview-365170338) is currently under review which extends the LOWESS implementation for interpolation. It won't pick up the fit/predict semantics, though. – normanius Jun 19 '20 at 01:26

5 Answers

18

Lowess works great for predicting (when combined with interpolation)! I think the code is pretty straightforward; let me know if you have any questions! [Matplotlib figure: original points, LOWESS-smoothed points, and the interpolated curve]

import matplotlib.pyplot as plt
%matplotlib inline
from scipy.interpolate import interp1d
import statsmodels.api as sm

# introduce some floats in our x-values
x = list(range(3, 33)) + [3.2, 6.2]
y = [1,2,1,2,1,1,3,4,5,4,5,6,5,6,7,8,9,10,11,11,12,11,11,10,12,11,11,10,9,8,2,13]

# lowess returns the "smoothed" data, with a fitted y value at every x value in the input
lowess = sm.nonparametric.lowess(y, x, frac=.3)

# unpack the lowess smoothed points to their values
lowess_x = list(zip(*lowess))[0]
lowess_y = list(zip(*lowess))[1]

# run scipy's linear interpolation over the smoothed points
# (interp1d can also extrapolate via fill_value='extrapolate')
f = interp1d(lowess_x, lowess_y, bounds_error=False)

xnew = [i/10. for i in range(400)]

# this generates y values for our x values via the interpolator;
# it will return NaN for values outside the x window (less than 3, greater than 32).
# One workaround is to clamp out-of-range inputs to f(min(lowess_x)) or f(max(lowess_x));
# see the note after this code block for a cleaner alternative.
ynew = f(xnew)


plt.plot(x, y, 'o')
plt.plot(lowess_x, lowess_y, '*')
plt.plot(xnew, ynew, '-')
plt.show()
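
Following up on the comment in the code above about out-of-range values: instead of a for loop, interp1d can clamp queries outside the training window to the boundary values. A small sketch reusing the lowess_x, lowess_y and xnew variables from the answer; the two-element fill_value needs a reasonably recent SciPy:

# clamp out-of-range queries to the smoothed values at the window edges
# (lowess output is sorted by x, so index 0 and -1 are the boundaries)
f_clamped = interp1d(lowess_x, lowess_y, bounds_error=False,
                     fill_value=(lowess_y[0], lowess_y[-1]))
ynew_clamped = f_clamped(xnew)

Passing fill_value="extrapolate" instead would linearly extrapolate beyond the window.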
Daniel Hitchcock
  • This would use linear interpolation. While it's not unreasonable, it's not really the same as "predicting using lowess". Lowess is defined as a weighted linear regression on a subset of the training points. The prediction it would make for a new point should be based on the result of that regression, rather than on predicting for two nearby points of the training set and then connecting them with a line. For a dense dataset, the difference is trivial, of course. Points outside the range should also be predicted with a weighted LR on the corresponding neighborhood, rather than a fixed value. – max May 06 '16 at 04:24
  • @max Just came across this question with a similar problem. While sklearn doesn't implement LOESS, it has a RANSAC implementation, which seems similar enough to my untrained eyes. Hope this is useful to someone: http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.RANSACRegressor.html – Aleksander Lidtke Dec 30 '16 at 18:25
  • @max it's not unreasonable at all and I've been using a similar approach to scale metabolomics data in a non-parametric fashion for some time. I scale points outside of the range to either the max or min of the LOWESS curve, and do a linear interpolation for everything else. If there are not enough points for a proper linear interpolation, then there are not enough points for a proper LOWESS curve in my opinion. Another note: I've been using the R library for LOWESS over the Python library. The Python library has a few problems with edge effects that I couldn't reconcile. Gotta love RPy2 – Daniel Hitchcock Jan 25 '17 at 19:24
  • @DanielHitchcock `If there are not enough points for a proper linear interpolation, then there are not enough points for a proper LOWESS curve in my opinion` -- I'm sympathetic to your argument, but that's because of my personal bias in favor of super simple techniques. I certainly wouldn't try to convince the many data scientists who use LOWESS that they should abandon it in favor of linear interpolation. I didn't downvote your answer, but I can see how SO users might consider it to be not answering my original question. – max Jan 25 '17 at 22:20
  • @AleksanderLidtke RANSAC might be usable in situations similar to when LOWESS (or indeed linear interpolation) is usable. But it's certainly a very different algorithm (it's non-deterministic; it is less sensitive to outliers since it's trying to remove them). – max Jan 25 '17 at 22:24
  • Little OT, but I'll check out RANSAC for my data normalization needs as well! – Daniel Hitchcock Jan 26 '17 at 16:54
  • @max `untrained eyes` of mine just learned something, thanks ;) – Aleksander Lidtke Feb 06 '17 at 01:30
  • @max this uses LOWESS halfway, in that it interpolates the y-values based on y predicted from LOWESS (i.e., based on the result of that regression), but only those corresponding to x-values in the training set. – ijoseph Jul 18 '19 at 21:06
  • @max I agree, though, that there should be a way to directly use whatever local regression estimates were used. – ijoseph Jul 18 '19 at 21:07
7

I've created a module called moepy that provides an sklearn-like API for a LOWESS model (incl. fit/predict). This enables predictions to be made using the underlying local regression models, rather than the interpolation method described in the other answers. A minimalist example is shown below.

# Imports
import numpy as np
import matplotlib.pyplot as plt
from moepy import lowess

# Data generation
x = np.linspace(0, 5, num=150)
y = np.sin(x) + (np.random.normal(size=len(x)))/10

# Model fitting
lowess_model = lowess.Lowess()
lowess_model.fit(x, y)

# Model prediction
x_pred = np.linspace(0, 5, 26)
y_pred = lowess_model.predict(x_pred)

# Plotting
plt.plot(x_pred, y_pred, '--', label='LOWESS', color='k', zorder=3)
plt.scatter(x, y, label='Noisy Sin Wave', color='C1', s=5, zorder=1)
plt.legend(frameon=False)

[Figure: LOWESS fit (dashed black line) over the noisy sine-wave scatter]

A more detailed guide on how to use the model (as well as its confidence and prediction interval variants) can be found in the moepy documentation.

Ayrton Bourn
4

Consider using kernel regression instead.

statsmodels has an implementation.

If you have too many data points, why not use scikit-learn's RadiusNeighborsRegressor and specify a tricube weighting function?
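
As a sketch of the kernel-regression suggestion, statsmodels exposes this as KernelReg; the toy data and settings below are illustrative assumptions, not part of the original answer:

import numpy as np
from statsmodels.nonparametric.kernel_regression import KernelReg

# toy training data
x = np.linspace(0, 5, 100)
y = np.sin(x) + np.random.normal(scale=0.1, size=len(x))

# 'c' marks the single regressor as continuous; the bandwidth is selected
# automatically (by cross-validation) unless given explicitly
kr = KernelReg(endog=y, exog=x, var_type='c')

# unlike statsmodels' lowess, the fitted model can be evaluated at new points
x_new = np.linspace(0, 5, 27)
y_new, marginal_effects = kr.fit(x_new)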

David R
  • @David_R, if you provided more clarity about what you meant (actually showed your implementation), this answer would be outstanding. Just a suggestion. – benjaminmgross Nov 14 '17 at 00:33
  • @benjaminmgross, thanks for the note. Maybe I'll find some time to elaborate later this week or this weekend. – David R Nov 15 '17 at 01:29
1

Check out the loess class in scikit-misc. The fitted object has a predict method:

# scikit-misc is a separate package (pip install scikit-misc)
from skmisc.loess import loess

loess_fit = loess(x, y, span=.01)   # x, y: training data arrays
loess_fit.fit()
preds = loess_fit.predict(x_new).values

https://has2k1.github.io/scikit-misc/stable/generated/skmisc.loess.loess.html

Sealander
0

It's not clear whether it's a good idea to have a dedicated LOESS object with separate fit/predict methods, like those commonly found in scikit-learn. By contrast, for neural networks you could have an object which stores only a relatively small set of weights. The fit method would then optimize those "few" weights using a very large training dataset, and the predict method would need only the weights to make new predictions, not the entire training set.

Predictions based on LOESS and nearest neighbors, on the other hand, need the entire training set to make new predictions. The only thing a fit method could do is store the training set in the object for later use. If x and y are the training data, and x0 are the points at which to make new predictions, this object-oriented fit/predict solution would look something like the following:

model = Loess()
model.fit(x, y)         # No calculations. Just store x and y in model.
y0 = model.predict(x0)  # Uses x and y just stored.

By comparison, in my localreg library, I opted for simplicity:

y0 = localreg(x, y, x0)

It really comes down to design choices, as the performance would be the same. One advantage of the fit/predict approach is that you could have a unified interface like in scikit-learn, where one model can easily be swapped for another. The fit/predict approach also encourages a machine-learning way of thinking about it, but in that sense LOESS is not very efficient, since it requires storing and using all the data for every new prediction. The single-call approach leans more towards the origins of LOESS as a scatterplot smoothing algorithm, which is how I prefer to think about it. This might also shed some light on why statsmodels does it the way it does.
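
For readers who do want the fit/predict pattern anyway, a thin wrapper over statsmodels is one way to get it. This is only a sketch: the class name is made up, and it assumes a statsmodels version new enough to have the xvals argument (the interpolation extension mentioned in the comments under the question).

import numpy as np
import statsmodels.api as sm

class StoredLowess:
    """Minimal fit/predict wrapper: fit() only stores the training data."""

    def __init__(self, frac=0.3):
        self.frac = frac

    def fit(self, x, y):
        # no computation here; LOWESS needs the full training set at
        # prediction time, so we just keep it
        self.x, self.y = np.asarray(x), np.asarray(y)
        return self

    def predict(self, x_new):
        # xvals evaluates the local regressions at the new points directly
        return sm.nonparametric.lowess(self.y, self.x, frac=self.frac,
                                       xvals=np.asarray(x_new))

model = StoredLowess().fit(x, y)
y0 = model.predict(x0)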

sigvaldm