0

I have MSSQL stored procedure that returns the needed values. I have been able to connect and reference this dataset through PyCharm. However, I am struggling to create a line graph from this information.

Here is the dataset of the SQL code:

enter image description here

Here is my code:

import matplotlib.pyplot as plt
import pyodbc
import pandas as pd

server = 'SERVER'
database = 'DATABASE'
username = 'USERNAME'
password = 'PASSWORD'

sql_connect = pyodbc.connect(
    'DRIVER={SQL Server};SERVER=' + server + ';DATABASE=' + database + ';UID=' + username + ';PWD=' + password
)

data = pd.read_sql('EXECUTE [dbo].[stored_procedure]', sql_connect)

df = pd.DataFrame({
    '3000': data["3000"],
    '4000': data["4000"],
    '5000': data["5000"],
    '6000': data["6000"],
    '7000': data["7000"],
    '8000': data["8000"],
    '8200': data["8200"],
}, index=data.Hour)
chart = df.plot.line()

When I go back to review the dataset that it returns I get the following:

enter image description here

I thought maybe it was a datatype error and converted each column to an integer like this:

df= pd.DataFrame({
    '3000': int(data["3000"]),
    '4000': int(data["4000"]),
    '5000': int(data["5000"]),
    '6000': int(data["6000"]),
    '7000': int(data["7000"]),
    '8000': int(data["8000"]),
    '8200': int(data["8200"]),
}, index=data.Hour)
chart = df.plot.line()

It worked at first however, I ran it again later and it started to error out. I reviewed some other questions and tried to apply the solutions to this issue but to no success. Any idea's on how I can correct this issue?

UPDATE

I figured out that the "Hour" index is causing the "nan" data issues and have adjusted my code to the following now:

df = pd.DataFrame({
    '3000': data["3000"].astype('int'),
    '4000': data["4000"].astype('int'),
    '5000': data["5000"].astype('int'),
    '6000': data["6000"].astype('int'),
    '7000': data["7000"].astype('int'),
    '8000': data["8000"].astype('int'),
    '8200': data["8200"].astype('int'),
})
lines = df .plot.line(
    color={
        "3000": "#00FFFF",
        "4000": "#228B22",
        "5000": "#191970",
        "6000": "#CC5500",
        "7000": "#0D98BA",
        "8000": "#FFFF00",
        "8200": '#FF69B4'
    }
)
plt.show()
sql_connect.close()

That code is returning the following graph: enter image description here

The data within the "Hour" column is considered an "object" data type and should be the x-axis on this chart. However, when adding that to the index, it errors out. Any suggestions?

Thanks again for all the help!

eyllanesc
  • 235,170
  • 19
  • 170
  • 241
  • `DataFrame.plot.line()` does not work like that. Please refer to the documentation to see how it works: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.plot.line.html – SNR Dec 15 '21 at 22:05
  • I think you should just use `pd.melt` on `data` to get the data in a better format for plotting – Chris Dec 15 '21 at 22:06
  • Try to close your connection after pd.read_sql.\ See: https://stackoverflow.com/questions/3783238/python-database-connection-close – NassimH Dec 15 '21 at 22:07
  • @SNR You're right. I must have just left it in there as a mistake! Thanks for sharing! – TheSheepdog Dec 16 '21 at 12:47
  • @Chris I am not sure what that is so, I'll have to dig into that. Thanks! – TheSheepdog Dec 16 '21 at 12:48
  • @NassimH Thanks for the recommendation. I'll make sure that is placed in there as well. – TheSheepdog Dec 16 '21 at 12:49

1 Answers1

0

regarding your Index, you can convert it to a datetime object as follow:\

df.index = pd.to_datetime(df.index, format='%H:%M')

It should be displayed correctly in your graph

NassimH
  • 462
  • 3
  • 13