
I am new to data visualization and attempting to make a simple time series plot using an SQL output and seaborn. I am having difficulty inserting the data retrieved from the SQL query into Seaborn. Is there some direction you can give me on how to visualize this dataframe using Seaborn?

My Python Code:

#!/usr/local/bin/python3.5

import cx_Oracle
import pandas as pd
from IPython.display import display, HTML
import matplotlib.pyplot as plt
import seaborn as sns

orcl = cx_Oracle.connect('sql_user/sql_pass@//sql_database_server.com:9999/SQL_REPORT')

sql = '''
select DATETIME, FRUIT,
COUNTS
from FRUITS.HEALTHY_FRUIT
WHERE DATETIME > '01-OCT-2016'
AND FRUIT = 'APPLE'
'''

curs = orcl.cursor()

df = pd.read_sql(sql, orcl)
display(df)

sns.kdeplot(df)
plt.show()

Dataframe (df) output:

    DATETIME  FRUIT  COUNTS
0 2016-10-02  APPLE  1.065757e+06
1 2016-10-03  APPLE  1.064369e+06
2 2016-10-04  APPLE  1.067552e+06
3 2016-10-05  APPLE  1.068010e+06
4 2016-10-06  APPLE  1.067118e+06
5 2016-10-07  APPLE  1.064925e+06
6 2016-10-08  APPLE  1.066576e+06
7 2016-10-09  APPLE  1.065982e+06
8 2016-10-10  APPLE  1.072131e+06
9 2016-10-11  APPLE  1.076429e+06

When I try to run plt.show() I get the following error:

TypeError: cannot astype a datetimelike from [datetime64[ns]] to [float64]
MBasith
  • what kind of plot do you want exactly? is there a reason you're passing the entire dataframe into `kdeplot()`? – benten Oct 13 '16 at 00:31
  • @benten Hi, I want a simple line graph with the DATETIME as the X-axis and Counts as Y-Axis. I'm not sure how to pass only the DATETIME and COUNTS into the dataframe. – MBasith Oct 13 '16 at 00:39

1 Answer

Instead of sns.kdeplot try the following:

# make time the index (this will help with plot ticks)
df.set_index('DATETIME', inplace=True)

# make figure and axis objects (sns.plt was only an alias for matplotlib.pyplot
# and has been removed from newer seaborn releases, so use plt directly)
fig, ax = plt.subplots(1, 1, figsize=(6, 4))
df.plot(y='COUNTS', ax=ax, color='red', alpha=.6)
fig.savefig('test.pdf')
plt.show()

The function kdeplot() is not what you want if you're trying to make a line graph. It does draw a line, but that line is a kernel density estimate: it approximates the distribution of a variable rather than showing how the variable changes over time. It also expects numeric data, which is why passing the whole dataframe, datetime column included, raises the TypeError. By far the easiest way to make a line plot is pandas' df.plot(). If you want seaborn's styling, call sns.set_style() before creating the figure, like in this question.

benten
  • That worked really well. I see the plot now. The only issue is that the X axis is not displaying the actual DATETIME just numbers 0-9. That may be the default bins? Is there a way I can correct that and is there a good online resource or book that I can get help with this on? – MBasith Oct 13 '16 at 00:55
  • I forgot to include `inplace=True` in the `set_index()` call. That's fixed now, so the code should give you nice x-axis labels. Plotting in pandas/seaborn is all done via matplotlib, so when you're googling for help, search for something like 'axis labels matplotlib pandas' and you should get some helpful results. [Here's a good primer for plotting in pandas](http://pandas.pydata.org/pandas-docs/version/0.18.1/visualization.html) – benten Oct 13 '16 at 01:01
  • That looks so beautiful I want to reach out and give you a hug! Thank you so much for your help. This gives me a really good start. Cheers. – MBasith Oct 13 '16 at 01:05
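
As a follow-up to the x-axis label discussion in the comments: since the plotting is done through matplotlib, the date ticks can also be controlled directly with matplotlib.dates. This is a rough sketch under a few assumptions: the dataframe is a hand-built stand-in for the SQL result, the tick spacing (every 2 days) and the "%d-%b" label format are arbitrary choices, and ax.plot() is used in place of df.plot() because pandas may install its own date tick handling.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for pd.read_sql(...): the ten rows shown in the question.
df = pd.DataFrame({
    "DATETIME": pd.date_range("2016-10-02", periods=10, freq="D"),
    "COUNTS": [1.065757e+06, 1.064369e+06, 1.067552e+06, 1.068010e+06,
               1.067118e+06, 1.064925e+06, 1.066576e+06, 1.065982e+06,
               1.072131e+06, 1.076429e+06],
})
df = df.set_index("DATETIME")  # same effect as set_index(..., inplace=True)

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(df.index, df["COUNTS"], color="red", alpha=0.6)

# One tick every second day, labelled like "02-Oct".
ax.xaxis.set_major_locator(mdates.DayLocator(interval=2))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%d-%b"))

ax.set_xlabel("DATETIME")
ax.set_ylabel("COUNTS")
fig.autofmt_xdate()  # rotate the labels so they do not overlap
```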