0

I have a pandas dataframe, shown in the table below, whose index is the date in yyyy-mm format, with a US recession indicator (USREC) and a time series variable M1.

Date    USREC   M1
2000-12     1088.4
2001-01     1095.08
2001-02     1100.58
2001-03     1108.1
2001-04 1   1116.36
2001-05 1   1117.8
2001-06 1   1125.45
2001-07 1   1137.46
2001-08 1   1147.7
2001-09 1   1207.6
2001-10 1   1166.64
2001-11 1   1169.7
2001-12     1182.46
2002-01     1190.82
2002-02     1190.43
2002-03     1194.85
2002-04     1186.82
2002-05     1186.9
2002-06     1194.55
2002-07     1199.26
2002-08     1183.7
2002-09     1197.1
2002-10     1203.47

I want to plot a chart in Python that looks like the attached chart, which was created in Excel.

I have searched for various examples online, but none are able to show the chart like below. Can you please help? Thank you.

I would also appreciate a pointer to an easier plotting library that needs few inputs and covers most of the plots Excel provides.

EDIT: I checked out the example in the page https://matplotlib.org/examples/pylab_examples/axhspan_demo.html. The code I have used is below.

fig, axes = plt.subplots()
df['M1'].plot(ax=axes)
axes.axvspan(['USREC'],color='grey',alpha=0.5)

I didn't see any example on the matplotlib.org pages where I can pass another column as the axvspan range. With the code above I get the error:

TypeError: axvspan() missing 1 required positional argument: 'xmax'
Zenvega
  • Try to include the actual code that builds the dataframe so people can help more easily. Also, open-ended questions are not a good fit for SO; please include what you have tried. – Diego Aguado Dec 24 '17 at 18:12
  • Adding the code to obtain the dataframe makes things easier for people trying to replicate your code. If you are looking for another plotting library, especially one that works with pandas dataframes, take a look at plotly: https://plot.ly/python/line-charts/ – Diego Aguado Dec 24 '17 at 18:40
  • Thanks for suggestion! I have added raw values so that it is easy to replicate. I am also researching bokeh to see if it is suitable or not. – Zenvega Dec 24 '17 at 18:53

2 Answers

0

I figured it out. I created a secondary y-axis for USREC and hid its axis label, as I wanted, but this also hid USREC from the legend. That is a minor issue.

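import matplotlib.pyplot as plt

# assumes `df` is the dataframe from the question, with columns 'USREC' and 'M1'
def plot_var(y1):
    fig0, ax0 = plt.subplots()
    ax1 = ax0.twinx()  # second y-axis sharing the same x-axis

    # line plot of the series passed in
    y1.plot(kind='line', stacked=False, ax=ax0, color='blue')
    # recession indicator drawn as a semi-transparent grey area
    df['USREC'].plot(kind='area', secondary_y=True, ax=ax1, alpha=.2, color='grey')
    ax0.legend(loc='upper left')
    ax1.legend(loc='upper left')
    plt.ylim(ymax=0.8)
    plt.axis('off')
    plt.xlabel('Date')
    plt.show()
    plt.close()

plot_var(df['M1'])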
Zenvega
  • Is there a way to do this but without the triangular shape of the shaded bars? @zenvega – Merv Merzoug Oct 16 '18 at 13:20
  • @Zenvega. This answer doesn't work, as clearly shown by your screenshot. The lines should be vertical. Maybe check this answer: https://stackoverflow.com/questions/65344945/how-to-create-a-plot-with-vertical-shades-in-matplotlib – PatrickT Jan 16 '22 at 09:54
0

There is a problem with Zenvega's answer: The recession lines are not vertical, as they should be. What exactly goes wrong, I am not entirely sure, but I show below how to get vertical lines.

My answer uses the call ax.fill_between(date_index, y1=ymin, y2=ymax, where=mask). I read the y1 and y2 arguments from the axis limits, and the where argument takes the recession data as a boolean series of True/False values.

import pandas as pd
import matplotlib.pyplot as plt
from io import StringIO

# copy-pasted data from OP
string_data=StringIO("""
    Date,USREC,M1
    2000-12,0,1088.4
    2001-01,0,1095.08
    2001-02,0,1100.58
    2001-03,0,1108.1
    2001-04,1,1116.36
    2001-05,1,1117.8
    2001-06,1,1125.45
    2001-07,1,1137.46
    2001-08,1,1147.7
    2001-09,1,1207.6
    2001-10,1,1166.64
    2001-11,1,1169.7
    2001-12,0,1182.46
    2002-01,0,1190.82
    2002-02,0,1190.43
    2002-03,0,1194.85
    2002-04,0,1186.82
    2002-05,0,1186.9
    2002-06,0,1194.55
    2002-07,0,1199.26
    2002-08,0,1183.7
    2002-09,0,1197.1
    2002-10,0,1203.47""")

# get data (defined above so the script runs top to bottom)
df = pd.read_csv(string_data, skipinitialspace=True)
df['Date'] = pd.to_datetime(df['Date'])

# convenience function
def plot_series(ax, df, index='Date', cols=['M1'], area='USREC'):
    # convert area variable to boolean
    df[area] = df[area].astype(int).astype(bool)
    # set up an index based on date
    df = df.set_index(keys=index, drop=False)
    # line plot
    df.plot(ax=ax, x=index, y=cols, color='blue')
    # extract limits
    y1, y2 = ax.get_ylim()
    # shade the full height of the axes wherever the area variable is True
    ax.fill_between(df[index].index, y1=y1, y2=y2, where=df[area], facecolor='grey', alpha=0.4)
    return ax

# set up figure, axis
f, ax = plt.subplots()
plot_series(ax, df)
ax.grid(True)
plt.show()

# after formatting, the data would look like this:
>>> df.head(2)
                 Date  USREC       M1
Date                                 
2000-12-01 2000-12-01  False  1088.40
2001-01-01 2001-01-01  False  1095.08

In the resulting plot, the edges of the shaded recession band are vertical.

An alternative approach would be to use plt.axvspan(), which spans the full height of the axes by default, so the y1 and y2 values do not need to be computed.
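A rough sketch of that axvspan alternative follows. It is not code from either answer. The helper names `recession_spans` and `shade_recessions` are hypothetical, and the sketch assumes the indicator holds only 0/1 or True/False values (no blanks or NaN). The idea is to collapse each consecutive run of recession months into one (start, end) pair and call `ax.axvspan(xmin, xmax)` once per run.

```python
def recession_spans(dates, flags):
    """Return (start, end) pairs, one per consecutive run of truthy flags."""
    spans = []
    start = None      # first date of the run currently open, if any
    previous = None   # date seen on the previous iteration
    for date, flag in zip(dates, flags):
        if flag and start is None:
            start = date                      # a run begins here
        elif not flag and start is not None:
            spans.append((start, previous))   # run ended on the previous date
            start = None
        previous = date
    if start is not None:
        spans.append((start, previous))       # run reaches the end of the data
    return spans


def shade_recessions(ax, dates, flags, **kwargs):
    """Draw one vertical band per run; extra kwargs go straight to ax.axvspan."""
    for xmin, xmax in recession_spans(dates, flags):
        ax.axvspan(xmin, xmax, **kwargs)
```

With the dataframe from this answer, a call such as `shade_recessions(ax, df['Date'], df['USREC'], color='grey', alpha=0.4)` after the line plot should shade from 2001-04 to 2001-11. Each band ends at the last flagged date, so a run of a single month has zero width; extending xmax by one period is a possible refinement.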

PatrickT
  • An even more challenging example with similar problem of non-vertical lines: https://datascience.stackexchange.com/questions/88588/ – PatrickT Jan 16 '22 at 15:53