I am trying to plot a trendline on a matplotlib scatter plot, but the trendline does not appear. What should I change in my code to make it appear? The event type column (`CDCEventType`) is categorical.

I've followed what most other Stack Overflow questions suggest for plotting a trendline, but mine still does not show up.

#import libraries
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from pandas.plotting import register_matplotlib_converters

#register datetime converters
register_matplotlib_converters()

#read dataset using pandas
dataset = pd.read_csv("UsrNonCallCDCEvents_CDCEventType.csv")


#convert date to datetime type
dataset['Interval'] = pd.to_datetime(dataset['Interval'])

#convert other columns to numeric type
for cols in list(dataset):
    if cols != 'Interval' and cols != 'CDCEventType':
        dataset[cols] = pd.to_numeric(dataset[cols])

#create pivot of dataset
pivot_dataset = dataset.pivot(index='Interval',columns='CDCEventType',values='AvgWeight(B)')

#create scatterplot with trendline
x = pivot_dataset.index.values.astype('float64')
y = pivot_dataset['J-STD-025']
plt.scatter(x,y)
z = np.polyfit(x,y,1)
p = np.poly1d(z)
plt.plot(x,p(x),"r--")
plt.show()

This is the graph currently being output. I am trying to get the same graph, but with a trendline: https://i.stack.imgur.com/8qBle.jpg It is fine that the x-axis does not show dates.

A snippet of my dataframe looks like this: https://imgur.com/a/xJAcgEI (I've painted out the irrelevant column names.)

starch
  • Please do not provide data as an image file. Provide mockup data, in particular check out [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – Diziet Asahi May 15 '19 at 19:41
  • The plotting code is okay. It looks like there are lots of NaNs in your dataframe... are you getting an error about least squares not converging? Try doing `dropna` on your data and trying again. – Matt Hall May 15 '19 at 19:49
  • Could you perhaps edit your question and let us know what the result of `print(p)` is? – Asmus May 15 '19 at 19:56
  • I dropped all rows containing "NaN" and it ended up working, thank you kwinkunks! – starch May 15 '19 at 20:02
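
The fix described in the comments can be sketched roughly as follows. This is a minimal illustration, not the asker's actual data: the `series` below is a made-up stand-in for `pivot_dataset['J-STD-025']`, and it assumes the missing values are ordinary NaNs. The idea is to drop the NaN rows first and then build both `x` and `y` from the cleaned series, so the two arrays stay aligned when passed to `np.polyfit`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for pivot_dataset['J-STD-025']:
# a datetime index and a column with gaps (NaN) left by the pivot.
index = pd.date_range("2019-05-01", periods=6, freq="D")
series = pd.Series([1.0, np.nan, 3.0, 4.0, np.nan, 6.0],
                   index=index, name="J-STD-025")

# Drop NaN rows first, then derive x and y from the same cleaned series
# so they keep the same length and alignment.
clean = series.dropna()
x = clean.index.values.astype("float64")  # same conversion as in the question
y = clean.values

# Fit a degree-1 polynomial (straight line) to the cleaned data.
z = np.polyfit(x, y, 1)
p = np.poly1d(z)

print(p)  # inspect the fitted line, as suggested in the comments
```

With `x`, `y`, and `p` built this way, the plotting calls from the question (`plt.scatter(x, y)` followed by `plt.plot(x, p(x), "r--")`) should work unchanged. Calling `dropna` on the whole pivoted frame, as the asker did, also works but discards any row that has a gap in any column, not just the one being plotted.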
