I created a pandas dataframe from some value counts on particular calendar dates. Here is how I did it:

time_series = pd.DataFrame(df['Operation Date'].value_counts().reset_index())
time_series.columns = ['date', 'count']

Basically, it has two columns: the first, "date", holds datetime.date objects, and the second, "count", holds plain integer values. Now I'd like to draw a scatter plot or a KDE to show how the count changes over the calendar days.

But when I try:

time_series.plot(kind='kde')
plt.show()

I get a plot where the x-axis is from -50 to 150 as if it is parsing the datetime.date objects as integers somehow. Also, it is yielding two identical plots rather than just one.

Any idea how I can plot them and see the calendar days along the x-axis?

guy

2 Answers

Are you sure the column actually holds datetime values? I just tried this and it worked fine:

df =
                   date  count
7   2012-06-11 16:51:32    1.0
3   2012-09-28 08:05:14   12.0
19  2012-10-01 18:01:47    4.0
2   2012-10-03 15:18:23   29.0
6   2012-12-22 19:50:43    4.0
1   2013-02-19 19:54:03   28.0
9   2013-02-28 16:08:40   17.0
12  2013-03-12 08:42:55    6.0
4   2013-04-04 05:27:27    6.0
17  2013-04-18 09:40:37   29.0
11  2013-05-17 16:34:51   22.0
5   2013-07-07 14:32:59   16.0
14  2013-10-22 06:56:29   13.0
13  2014-01-16 23:08:46   20.0
15  2014-02-25 00:49:26   10.0
18  2014-03-19 15:58:38   25.0
0   2014-03-31 05:53:28   16.0
16  2014-04-01 09:59:32   27.0
8   2014-04-27 12:07:41   17.0
10  2014-09-20 04:42:39   21.0

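import matplotlib.pyplot as plt

# Sort chronologically so the line is drawn left to right in date order.
df = df.sort_values('date', ascending=True)
plt.plot(df['date'], df['count'])
plt.xticks(rotation='vertical')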

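For reference, here is a minimal end-to-end sketch of the same approach, starting from a frame shaped like the one in the question. The sample dates are made up, the output file name is arbitrary, and the Agg backend line is only an assumption so that the script runs without a display; drop it when working interactively.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample data standing in for the question's 'Operation Date' column.
raw = pd.DataFrame({
    "Operation Date": [
        dt.date(2017, 1, 3), dt.date(2017, 1, 1), dt.date(2017, 1, 3),
        dt.date(2017, 1, 2), dt.date(2017, 1, 3), dt.date(2017, 1, 1),
    ]
})

# Count rows per calendar day and name the two resulting columns explicitly.
counts = raw["Operation Date"].value_counts()
time_series = pd.DataFrame({"date": counts.index, "count": counts.to_numpy()})

# datetime.date objects are stored with dtype 'object'; convert them to datetime64.
time_series["date"] = pd.to_datetime(time_series["date"])
time_series = time_series.sort_values("date", ascending=True)

fig, ax = plt.subplots()
ax.plot(time_series["date"], time_series["count"], "*")
ax.tick_params(axis="x", rotation=90)
fig.savefig("counts_by_day.png")
```

Building the frame from `counts.index` and `counts.to_numpy()` is just one way to name the columns; it sidesteps differences in how `reset_index()` labels them across pandas versions.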

EDIT:

If you want a scatter plot, pass a marker format string such as '*':

plt.plot(df['date'], df['count'], '*')
plt.xticks(rotation='vertical')


epattaro
  • Thanks, that worked for some weird reason. Also, I removed the backslashes in your 2nd line of code; not sure why you included them. Thanks! – guy Jan 23 '17 at 21:29
  • I'm not sure if it's too late, but what does 'plt' stand for? – SPS Sep 17 '18 at 10:42
  • import matplotlib.pyplot as plt – epattaro Sep 18 '18 at 00:03
  • This is very helpful and straight to the point. A very universal answer. – msarafzadeh Jun 30 '19 at 09:03
  • Is there a way to add color to this plot using another variable? – Christa Jan 14 '20 at 11:16
  • Hi Christa, there is the color argument; see this post: https://stackoverflow.com/questions/22408237/named-colors-in-matplotlib. It's also possible to plot multiple graphs in the same figure, each with its own color. For that, just run the plot command N times, each with its own data and color parameters. – epattaro Jan 17 '20 at 12:35
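
The last comment's suggestion (one plot call per series, each with its own color) could look roughly like the sketch below. The `group` column, the sample values, and the color mapping are all hypothetical, and the Agg backend line is only there so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: the 'group' column decides the color of each series.
df = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03",
                            "2020-01-01", "2020-01-02", "2020-01-03"]),
    "count": [3, 5, 2, 4, 1, 6],
    "group": ["a", "a", "a", "b", "b", "b"],
})
colors = {"a": "tab:blue", "b": "tab:orange"}

fig, ax = plt.subplots()
# One plot call per group, each drawn on the same axes with its own color.
for name, part in df.groupby("group"):
    ax.plot(part["date"], part["count"], "*", color=colors[name], label=name)
ax.legend(title="group")
ax.tick_params(axis="x", rotation=90)
fig.savefig("counts_by_group.png")
```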

If the column has a datetime dtype (not object), you can call plot() directly on the dataframe. You don't need to sort by date either; that is done behind the scenes when the x-axis is datetime.

df['date'] = pd.to_datetime(df['date'])
df.plot(x='date', y='count', kind='scatter', rot='vertical');

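
To check which case applies, a quick sketch like the following can help. The two-row frame is made up; it stores datetime.date objects, as in the question, which pandas keeps as a generic object column until they are converted.

```python
import datetime as dt

import pandas as pd

# Made-up frame holding datetime.date objects, as described in the question.
df = pd.DataFrame({
    "date": [dt.date(2017, 1, 2), dt.date(2017, 1, 1)],
    "count": [4, 7],
})

# Before conversion the column is a plain object column, not a datetime one.
is_datetime_before = pd.api.types.is_datetime64_any_dtype(df["date"])

# pd.to_datetime turns the date objects into a proper datetime64 column.
df["date"] = pd.to_datetime(df["date"])
is_datetime_after = pd.api.types.is_datetime64_any_dtype(df["date"])

print(is_datetime_before, is_datetime_after)
```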

You can also pass many arguments to make the plot nicer (add titles, change figsize and fontsize, rotate tick labels, set the subplots' axes, etc.). See the docs for the full list of possible arguments.

df.plot(x='date', y='count', kind='line', rot=45, legend=None, 
        title='Count across time', xlabel='', fontsize=10, figsize=(12,4));


You can even use another column to color scatter plots. In the example below, the months are used to assign color. Tip: To get the full list of possible colormaps, pass any gibberish string to colormap and the error message will show you the full list.

df.plot(x='date', y='count', kind='scatter', rot=90, c=df['date'].dt.month, colormap='tab20', sharex=False);

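
As an alternative to triggering an error message, the registered colormap names can also be listed directly. This is a small sketch using matplotlib's `plt.colormaps()`; the exact set of names depends on the installed matplotlib version.

```python
import matplotlib.pyplot as plt

# plt.colormaps() returns the names of all registered colormaps.
names = plt.colormaps()

print(len(names))
print("tab20" in names)
```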

cottontail