
I want to build a model that describes a curve fitting the data shown in the scatterplot. I thought this would be straightforward using sklearn, but choosing and applying the different methods gets rather confusing.

Which algorithms would you use to tackle this problem?

[Image: scatterplot of the data]

Benni
  • Do you want equations that create similar (visually identical) datasets, or do you want to fit a curve/polynomial through the middle of the plot, or what exactly? What is the input (image, point cloud, equations, ???), and what is the output? See [Trying to fit a sine function to phased light curve](https://stackoverflow.com/a/41208993/2521214) and [Curve fitting with y points on repeated x positions (Galaxy Spiral arms)](https://stackoverflow.com/a/35865478/2521214) – Spektre Sep 07 '18 at 08:36

1 Answer


This is really a question for Cross Validated rather than a Python question.

Your data strongly suggest a simple underlying model that is linear until the very end, where it perhaps becomes polynomial.

As a first step, if possible, I would investigate this phenomenon. It's unusual. Perhaps there's something wrong with the data source. But maybe not. For example, a physical phenomenon with two distinct phases might produce data like these.

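As to models, I would suggest natural cubic splines for this data. They are simple: you split the range of x into windows separated by knots and fit a cubic polynomial in each window (a line being a special case), with the pieces constrained to join smoothly and the fit constrained to be linear beyond the outermost knots.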
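To make the idea concrete, here is a minimal sketch of a natural cubic spline regression using only NumPy. It builds the textbook truncated-power basis and fits it by ordinary least squares. The data are synthetic stand-ins (linear, then curving near the end), and the helper names `natural_cubic_basis` and `predict`, as well as the knot positions, are my own illustrative assumptions, not anything from the question.

```python
import numpy as np


def natural_cubic_basis(x, knots):
    """Natural cubic spline basis (truncated-power form) evaluated at x."""
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    K = len(knots)

    def d(k):
        # Scaled difference of truncated cubics anchored at knot k and the last knot.
        return (np.maximum(x - knots[k], 0.0) ** 3
                - np.maximum(x - knots[K - 1], 0.0) ** 3) / (knots[K - 1] - knots[k])

    # Columns: intercept, linear term, then K - 2 spline terms.
    cols = [np.ones_like(x), x]
    cols += [d(k) - d(K - 2) for k in range(K - 2)]
    return np.column_stack(cols)


# Synthetic data (an assumption for illustration): linear, then bending upward.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 10.0, 200))
y = 2.0 * x + np.where(x > 8.0, (x - 8.0) ** 2, 0.0) + rng.normal(0.0, 0.3, x.size)

# Knots at hand-picked quantiles, denser near the end where the curve bends.
knots = np.quantile(x, [0.05, 0.35, 0.65, 0.85, 0.95])

# Ordinary least squares on the spline basis.
coef, *_ = np.linalg.lstsq(natural_cubic_basis(x, knots), y, rcond=None)


def predict(x_new):
    """Evaluate the fitted spline at new x values."""
    return natural_cubic_basis(x_new, knots) @ coef
```

The number and placement of knots is the main tuning choice here. Putting more of them where the data change shape is a common heuristic, and cross-validation is one way to choose between candidate layouts.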

You might also consider smoothing splines and local regression.
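As a rough illustration of local regression, the sketch below fits a weighted straight line around each evaluation point, using tricube weights over the nearest fraction of the data. This is a simplified, hand-rolled variant (no robustness iterations), not a library implementation. The function name `local_linear`, the `frac` parameter, and the synthetic data are assumptions of mine.

```python
import numpy as np


def local_linear(x, y, x_eval, frac=0.3):
    """Local linear regression with tricube weights at each point of x_eval."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    k = max(2, int(np.ceil(frac * x.size)))  # neighbours used per local fit
    fitted = np.empty(x_eval.size)
    for i, x0 in enumerate(x_eval):
        dist = np.abs(x - x0)
        # Bandwidth: distance to the k-th nearest neighbour (guarded against zero).
        h = max(np.sort(dist)[k - 1], np.finfo(float).eps)
        w = np.clip(1.0 - (dist / h) ** 3, 0.0, None) ** 3  # tricube weights
        sw = np.sqrt(w)
        # Weighted least-squares line centred at x0; its intercept is the fit at x0.
        A = np.column_stack([np.ones_like(x), x - x0])
        beta, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
        fitted[i] = beta[0]
    return fitted


# Synthetic data (an assumption for illustration): linear, then bending upward.
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 10.0, 200))
y = 2.0 * x + np.where(x > 8.0, (x - 8.0) ** 2, 0.0) + rng.normal(0.0, 0.3, x.size)

grid = np.linspace(x.min(), x.max(), 50)
smooth = local_linear(x, y, grid, frac=0.2)
```

A smaller `frac` follows the data more closely and a larger one smooths more. For smoothing splines, `scipy.interpolate.UnivariateSpline` with its smoothing factor `s` is one ready-made option worth trying.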

For information on these, see the free online textbook, An Introduction to Statistical Learning.

Denziloe