I am trying to fit a linear regression between the GDP and Count columns of the data set below:

[Image: the data set]

Here is my code:

CT = slav_df.iloc[:, 4]
GDP = slav_df.iloc[:,3]
t_slope, t_int, t_r, t_p, t_std_err = stats.linregress(GDP, CT)
talent_fit = t_slope * CT + t_int

#plotting 
fig, (ax1) = plt.subplots(1, sharex=True)
fig.suptitle(" ", fontsize=16, fontweight="bold")
ax1.set_xlim(min(CT), max(CT))
ax1.plot(GDP, CT, linewidth=1, marker="o")
ax1.plot(GDP, talent_fit, "b--", linewidth=1)
ax1.set_ylabel(" ")

When I run this, I get a graph like this:

[Image: the resulting plot]

This doesn't look right. Am I doing something wrong here?

Sheldore
Slavisha
  • If `GDP` is not sorted, `plt` will draw the lines back and forth between the points; I'm guessing that could be your problem. Try sorting the data or, even simpler, replace `ax1.plot(GDP, talent_fit, "b--", linewidth=1)` with `ax1.plot(GDP[GDP.argsort()], talent_fit[GDP.argsort()], "b--", linewidth=1)` – Tarifazo Jan 30 '19 at 13:10
  • Also, you should check out [difference-in-plotting-with-different-matplotlib-versions](https://stackoverflow.com/questions/47155569/difference-in-plotting-with-different-matplotlib-versions) – Sheldore Jan 30 '19 at 13:15
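
To illustrate the sorting suggestion in the comment above, here is a minimal sketch with made-up numbers (not the data from the question). It also evaluates the fitted line at the x values (GDP), which is probably what was intended, since `stats.linregress(GDP, CT)` treats GDP as x. The helper names `fit_line` and `plot_fit` are hypothetical, and the least-squares formula is written out by hand only to keep the example dependency-free; `stats.linregress(gdp, count)` should return the same slope and intercept as its first two values.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up, deliberately unsorted sample values (assumption, not the real data).
gdp = [3.0, 1.0, 5.0, 2.0, 4.0]
count = [31.0, 13.0, 50.0, 23.0, 43.0]

# GDP is the x variable, Count is the y variable.
slope, intercept = fit_line(gdp, count)

# Sort the points by x so that a line plot does not zigzag back and forth.
order = sorted(range(len(gdp)), key=lambda i: gdp[i])
gdp_sorted = [gdp[i] for i in order]
count_sorted = [count[i] for i in order]

# Evaluate the fitted line at the x values (GDP), not at the y values.
fit_sorted = [slope * x + intercept for x in gdp_sorted]


def plot_fit(ax):
    """Draw the data as markers and the fit as a dashed line on a matplotlib Axes."""
    ax.set_xlim(min(gdp_sorted), max(gdp_sorted))
    ax.plot(gdp_sorted, count_sorted, "o")  # markers only, no connecting line
    ax.plot(gdp_sorted, fit_sorted, "b--", linewidth=1)
    ax.set_xlabel("GDP")
    ax.set_ylabel("Count")
```

With matplotlib installed, something like `fig, ax = plt.subplots()` followed by `plot_fit(ax)` would draw the result. Plotting the raw data with markers only is a design choice here: it sidesteps the zigzag for the scatter, so only the fitted line depends on the sort order.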
