
I've got two pandas Series: one with a 7-day rolling mean for the entire year and another with monthly averages. I'm trying to plot them both on the same matplotlib figure, with the monthly averages as a bar graph and the 7-day rolling mean as a line graph. Ideally, the line would be drawn on top of the bars.

The issue I'm having is that, with my current code, the bar graph is showing up without the line graph, but when I try plotting the line graph first I get a ValueError: ordinal must be >= 1.

Here's what the series look like:

These are the first 15 values of the 7-day rolling mean series; it has a date and a value for the entire year:

date
2016-01-01         NaN
2016-01-03         NaN
2016-01-04         NaN
2016-01-05         NaN
2016-01-06         NaN
2016-01-07         NaN
2016-01-08    0.088473
2016-01-09    0.099122
2016-01-10    0.086265
2016-01-11    0.084836
2016-01-12    0.076741
2016-01-13    0.070670
2016-01-14    0.079731
2016-01-15    0.079187
2016-01-16    0.076395

This is the entire monthly average series:

dt_month
2016-01-01    0.498323
2016-02-01    0.497795
2016-03-01    0.726562
2016-04-01    1.000000
2016-05-01    0.986411
2016-06-01    0.899849
2016-07-01    0.219171
2016-08-01    0.511247
2016-09-01    0.371673
2016-10-01    0.000000
2016-11-01    0.972478
2016-12-01    0.326921

Here's the code I'm using to try and plot them:

ax = series_one.plot(kind="bar", figsize=(20,2))
series_two.plot(ax=ax)
plt.show()

Here's the graph that generates:

[Image: the graph my code generates]

Any help is hugely appreciated! Also, advice on formatting this question and on writing code that creates two series for a minimal working example would be awesome.

Thanks!!

Kyle Frye
    Concerning the additional advice you're asking for, see [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). In particular, keep in mind that a simple printed table does not reveal the datatype of the values (e.g. int vs. string vs. Timestamp). – ImportanceOfBeingErnest Apr 21 '19 at 20:39
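As a rough sketch of what such a reproducible example could look like for this question, the snippet below builds two series shaped like the ones shown above. The values are random and the names `daily`, `rolling_mean` and `monthly_mean` are made up for illustration; they are not the asker's data or code. Printing the index dtypes covers the point that a printed table alone hides the datatype.

```python
import numpy as np
import pandas as pd

# Synthetic daily data for 2016 (assumption: the real data is daily too)
rng = np.random.default_rng(0)
days = pd.date_range("2016-01-01", "2016-12-31", freq="D")
daily = pd.Series(rng.random(len(days)), index=days)
daily.index.name = "date"

# 7-day rolling mean: the first 6 values are NaN because the window is incomplete
rolling_mean = daily.rolling(7).mean()

# Monthly averages, labelled with the first day of each month
monthly_mean = daily.resample("MS").mean()
monthly_mean.index.name = "dt_month"

# Show the index dtypes explicitly, since a printed table does not reveal them
print(rolling_mean.index.dtype, monthly_mean.index.dtype)
print(rolling_mean.head(10))
print(monthly_mean)
```

Anyone can paste this and get two series with a DatetimeIndex each, which is enough to reproduce the plotting problem.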

2 Answers


The problem is that pandas bar plots are categorical: the bars are placed at successive integer positions (0, 1, 2, and so on), and the dates are only used as tick labels. Since in your case the two series have a different number of elements, plotting the line graph in categorical coordinates is not really an option. What remains is to plot the bar graph in numerical (date) coordinates as well. pandas' bar plot does not offer this, but it is the default behaviour of matplotlib's own bar.

Below I shift the monthly dates by 15 days to the middle of the month to have nicely centered bars.

import matplotlib.pyplot as plt
import numpy as np; np.random.seed(42)
import pandas as pd

t1 = pd.date_range("2018-01-01", "2018-12-31", freq="D")
s1 = pd.Series(np.cumsum(np.random.randn(len(t1)))+14, index=t1)
s1[:6] = np.nan

t2 = pd.date_range("2018-01-01", "2018-12-31", freq="MS")
s2 = pd.Series(np.random.rand(len(t2))*15+5, index=t2)

# shift monthly data to middle of month
s2.index += pd.Timedelta('15 days')


fig, ax = plt.subplots()

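# bars in date coordinates; width is given in days
ax.bar(s2.index, s2.values, width=14, alpha=0.3)
# line on the same date axis, drawn on top of the bars
ax.plot(s1.index, s1.values)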

plt.show()

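To check the point about categorical positions, here is a small sketch that compares where the bar centres end up on the x-axis with pandas and with matplotlib. The data is synthetic, the names `monthly`, `ax_pd` and `ax_mpl` are mine, and the Agg backend is only selected so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display needed)
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

months = pd.date_range("2018-01-01", periods=12, freq="MS")
monthly = pd.Series(np.linspace(5, 20, 12), index=months)

fig, (ax_pd, ax_mpl) = plt.subplots(2, 1)

# pandas bar plot: bar centres are integers, the dates are only tick labels
monthly.plot(kind="bar", ax=ax_pd)
pandas_x = [p.get_x() + p.get_width() / 2 for p in ax_pd.patches]

# matplotlib bar plot: bar centres are date numbers (days since matplotlib's epoch)
bars = ax_mpl.bar(monthly.index, monthly.values, width=14)
mpl_x = [p.get_x() + p.get_width() / 2 for p in bars]

print(pandas_x)
print(mpl_x)
```

A daily line plotted against dates therefore lands far away from bars sitting at 0 to 11, which is why both have to share the same kind of coordinate.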

ImportanceOfBeingErnest

The problem might be that the two series' indices are on very different scales. You can use ax.twiny to give the line its own x-axis and plot both:

ax = series_one.plot(kind="bar", figsize=(20,2))
ax_tw = ax.twiny()
series_two.plot(ax=ax_tw)
plt.show()

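For anyone who wants to try this without the original data, here is a self-contained sketch of the same approach. The two series are synthetic stand-ins that reuse the names `series_one` and `series_two`, and the Agg backend is an assumption so that it runs headless. It prints the limits of both x-axes to show that they use different coordinates.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display needed)
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-ins for the two series in the question
days = pd.date_range("2016-01-01", "2016-12-31", freq="D")
series_two = pd.Series(np.linspace(0, 1, len(days)), index=days).rolling(7).mean()
series_one = series_two.resample("MS").mean()

# Bars on the original axes, line on a second x-axis that shares the y-axis
ax = series_one.plot(kind="bar", figsize=(20, 2))
ax_tw = ax.twiny()
series_two.plot(ax=ax_tw)

# The bar axis is in integer positions, the twin axis in date-based numbers
print(ax.get_xlim(), ax_tw.get_xlim())
```

Because the two x-axes are independent, the line and the bars line up only as far as both axes happen to span the same period, so it is worth checking the alignment at the edges of the plot.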

Quang Hoang