My dataframe has two columns, x and y. The graph below shows a scatter plot of x against y.

Based on the scatter plot, I fit a straight line with the following code, which produces the blue line in the following image.

[Image: scatter plot of x and y with the blue linear fit and the desired red curve]

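import matplotlib.pyplot as plt
from numpy.polynomial.polynomial import polyfit  # assumed source; returns coefficients lowest degree first

fig, ax = plt.subplots(nrows=1, ncols=1)
ax.scatter(df['x'], df['y'])
b, m = polyfit(df['x'], df['y'], 1)  # b = intercept, m = slope
ax.plot(df['x'], b + m * df['x'], 'blue', linewidth=1)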

Now I want to fit another curve, perhaps a polynomial, to the same scatter plot. The desired result is something like the red curve in the image above. I tried the following, taken from here.

coefs = np.polyfit(df['x'], df['y'], 2)
p = np.poly1d(coefs)
plt.plot(df['x'], df['y'], "bo", markersize= 2)
plt.plot(df['x'], p(df['x']), "r-")

But the result is incorrect for my data as shown below.
[Image: the polynomial fit drawn as multiple red lines going back and forth across the scatter plot]
How should I proceed?
Edit: The data is here.

k.ko3n
  • 2
    The fit is driven by the spread of the data: a few points outside the main group skew it. You can delete the points outside the main group and try again, or you can apply weights to the y-coordinates of the sample points. – Zaraki Kenpachi Jun 06 '19 at 17:41
  • 1
    @Zaraki Kenpachi. I don't mind including the outliers, at some risk. The problem is, as you can see, that the curve is not a single line but multiple lines going back and forth. – k.ko3n Jun 06 '19 at 17:45
  • Post a link to your data, and I will try to find a solution. – Zaraki Kenpachi Jun 06 '19 at 17:46
  • please see Edit – k.ko3n Jun 06 '19 at 17:56
  • 1
    see this: https://stackoverflow.com/questions/42998607/large-dataset-polynomial-fitting-using-numpy – Zaraki Kenpachi Jun 06 '19 at 18:18
  • 1
    The problem lies in the sorting; check the answer below. For a linear fit you don't see it, because going back and forth among points on a straight line still traces a straight line. – Sheldore Jun 06 '19 at 20:14

1 Answer

Based on the link posted by Zaraki Kenpachi, try adding df = df.sort_values(by='x') before your existing fitting and plotting code.
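
As a rough sketch of why this helps: the fitted coefficients do not depend on row order, but a line plot joins points in the order they are given, so unsorted x values make the curve zigzag. The data below is synthetic and only a stand-in for the question's data, and `plot_fit` is a hypothetical helper name used for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the question's data (assumption: not the real dataset).
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)  # x values in random, unsorted order
y = 0.5 * x**2 - 2 * x + rng.normal(0, 1, x.size)
df = pd.DataFrame({"x": x, "y": y})

# Fitting on the unsorted rows already gives the right coefficients.
coefs_unsorted = np.polyfit(df["x"], df["y"], 2)

# Sort by x so that a line plot visits the points from left to right.
df_sorted = df.sort_values(by="x")
coefs_sorted = np.polyfit(df_sorted["x"], df_sorted["y"], 2)
p = np.poly1d(coefs_sorted)


def plot_fit(frame, poly):
    """Hypothetical helper: scatter the data and draw the fitted curve."""
    import matplotlib.pyplot as plt  # imported here so the fit above runs without it

    fig, ax = plt.subplots()
    ax.plot(frame["x"], frame["y"], "bo", markersize=2)
    ax.plot(frame["x"], poly(frame["x"]), "r-")  # smooth because x is sorted
    return fig, ax
```

Calling `plot_fit(df_sorted, p)` should draw a single red curve, while passing the unsorted `df` should reproduce the back-and-forth lines from the question.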

Peter Leimbigler