
Two-part question: through a bunch of cobbled-together googling I managed to glue together Python code that takes the log of a list, plots it against the original list, and applies a linear line of best fit (code below, synthetic data).

How would I go about printing the details for this linear fit (such as the gradient, y-intercept, chi squared) on the graph itself?

How would I modify the code to do the same for a polynomial fit, such as an x^2 line?

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from numpy import exp, loadtxt, pi, sqrt, random, linspace
from lmfit import Model
import glob, os

x=[2,3,5,7,11,13,17,19,23,29,31]
y=np.log10(x)
print(y)

plt.scatter(x, y, label="prime-logs", color="red",
            marker="1", s=50)


plt.xlabel('Primes')
plt.ylabel('Log10 Primes')

plt.title('Non-Log Plot of Log Prime v Prime')
plt.legend()

plt.plot(np.unique(x), np.poly1d(np.polyfit(x, y, 1))(np.unique(x)))

plt.show()
Epideme
  • Check out the matplotlib [guide to annotations](https://matplotlib.org/3.1.1/tutorials/text/annotations.html). – mullinscr Sep 29 '20 at 22:25
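Following the annotations guide linked in the comment above, here is a minimal sketch of putting the fit details directly on the axes. The placement coordinates and number formatting are illustrative choices, not from the original post:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.array([2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31])
y = np.log10(x)

# degree-1 fit: coefficients [m, b] plus the sum of squared residuals
(m, b), res, *_ = np.polyfit(x, y, 1, full=True)

fig, ax = plt.subplots()
ax.scatter(x, y, label="prime-logs", color="red", marker="1", s=50)
ax.plot(x, m * x + b)

# write the gradient, intercept and chi-squared in the top-left corner,
# using axes coordinates so the text stays put regardless of data limits
ax.text(0.05, 0.95,
        f"m = {m:0.3f}\nb = {b:0.3f}\n$\\chi^2$ = {res[0]:0.3f}",
        transform=ax.transAxes, va="top")
```

`ax.text` with `transform=ax.transAxes` is the simplest approach; `ax.annotate` works the same way if you also want an arrow pointing at a data point.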

1 Answer


Here is a solution. Only the chi-squared values are displayed here (taken as the sum of squared residuals). Note that there is no gradient descent involved with `np.polyfit`: the problem is a linear least-squares problem, which can be solved directly via an SVD-based pseudo-inverse.

Using an algorithm that exposes every step of the optimization (gradient descent or any other iterative optimizer) would let you display each step of the fit on the same figure.
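To illustrate the point about the direct solution, here is a small check (not part of the original answer) that `np.polyfit` agrees with an explicit SVD pseudo-inverse solve of the same least-squares problem:

```python
import numpy as np

x = np.array([2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31], dtype=float)
y = np.log10(x)

# build the design (Vandermonde) matrix for a degree-1 polynomial
A = np.vander(x, 2)                  # columns: [x, 1]

# solve the least-squares problem with the SVD-based pseudo-inverse
coef_pinv = np.linalg.pinv(A) @ y

# np.polyfit solves the same problem internally
coef_polyfit = np.polyfit(x, y, 1)

print(np.allclose(coef_pinv, coef_polyfit))  # the two solutions agree
```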

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from numpy import exp, loadtxt, pi, sqrt, random, linspace

# from lmfit import Model
import glob, os

x = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
y = np.log10(x)
print(y)

plt.plot(x, y, "1", ms=8, label="prime-logs")


y1, r, *_ = np.polyfit(x, y, 1, full=True)
f1 = np.poly1d(y1)
plt.plot(x, f1(x), label=rf"linear ($\chi^2$ = {r[0]:0.2f})")

y2, r, *_ = np.polyfit(x, y, 2, full=True)
f2 = np.poly1d(y2)
plt.plot(x, f2(x), label=rf"quadratic ($\chi^2$ = {r[0]:0.2f})")

plt.xlabel("Primes")
plt.ylabel("Log10 Primes")
plt.title("Non-Log Plot of Log Prime v Prime")
plt.legend()

[Plot: polyfit of the log primes, showing the linear and quadratic fits]
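For completeness, the gradient and y-intercept asked about in the question are simply the entries of the coefficient arrays returned by `np.polyfit`; a short sketch:

```python
import numpy as np

x = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
y = np.log10(x)

# degree 1: coefficients come back highest power first, so y1 == [m, b]
y1, r, *_ = np.polyfit(x, y, 1, full=True)
m, b = y1
print("m =", m)
print("b =", b)

# degree 2: y2 == [a, b, c] for a*x**2 + b*x + c
y2, r2, *_ = np.polyfit(x, y, 2, full=True)
print("a, b, c =", y2)
```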

Leonard
  • Hi. Thank you, this is exactly what I was looking for. I don't think I would ever have got there without your answer. Can I ask what you mean by 'gradient descent'? Also, if it is not too much trouble, would you mind walking me through some of the methodology here, especially the `y1, r, *_`, the `{r[0]:0.2f}`, and `ms=8` parts? – Epideme Oct 01 '20 at 14:48
  • As an aside, how would I get this to return constants for the lines? Before, I managed to get to `m, b = np.polyfit(x, y, 1)`, `plt.plot(x, y, 'o')`, `plt.plot(np.unique(x), np.poly1d(np.polyfit(x, y, 1))(np.unique(x)))`, `print('m =', m)`, `print('b =', b)`, but these quantities don't seem to exist in the method you used? (I hadn't found a way to print them to the graph itself yet, which I am guessing is what the `{r[0]:0.2f}` bit you used was for?) – Epideme Oct 01 '20 at 14:48
  • Also, which bit actually does the get the reduced chi^2 value? is that the 'r'? – Epideme Oct 01 '20 at 15:17
  • 1
    So, in the correct order: I used `y1, r, *_ = ...` in order to get only the two first output values of `np.polifit`, the remaining is "trashed" in the "_" variable. More about this trick [here](https://stackoverflow.com/questions/13726155/is-an-acceptable-way-to-ignore-arguments-in-python). The `ms=8` stands for `marker_size` (check the [Matplotlib plot doc](https://matplotlib.org/3.3.2/api/_as_gen/matplotlib.pyplot.plot.html) for more details) – Leonard Oct 01 '20 at 20:13
  • 1
    You can always recover the variables I hid in the `*_`variables (check what you can get out of `np.polyfit` in the [doc](https://numpy.org/doc/stable/reference/generated/numpy.polyfit.html), in the "returns" section. In my case, I considered the chi-squared value as the sum-of-squares (directly the `r` out of `np.polyfit`, that's it). Other definitions include normalizing by the degree of freedom (check [this other answer](https://stackoverflow.com/questions/5477359/chi-square-numpy-polyfit-numpy) for instance). – Leonard Oct 01 '20 at 20:18
  • 1
    Finally, the `f"chi = {r[0]:0.2}"` literally means, print a string that starts with "chi = " and print therein the first value of the residuals matrix `r[0]` (because actually `r` is a matrix, and the residuals are the off-diagonal terms). The `:0.2f` asks to print only 2 digits after decimal point in order to avoid super long strings display. – Leonard Oct 01 '20 at 20:21
  • Thank you. Very much appreciated. – Epideme Oct 06 '20 at 15:20
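The unpacking trick and the `full=True` return values discussed in the comments above can be sketched on toy data:

```python
import numpy as np

x = np.arange(1.0, 12.0)
y = 2.0 * x + 1.0  # exactly linear toy data

# full=True makes np.polyfit return five values, in order:
# coefficients, residuals, rank, singular values, rcond
coeffs, residuals, rank, sv, rcond = np.polyfit(x, y, 1, full=True)

# extended unpacking keeps the first two and collects the rest in a list
c, res, *rest = np.polyfit(x, y, 1, full=True)

print(rank)          # 2: the design matrix [x, 1] has full column rank
print(residuals[0])  # sum of squared residuals; ~0 since the data is exactly linear
```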