1

I have two data frames - one data frame consists of all metric values on a monthly basis, and another one consists of lower and upper bounds for a given period.

import pandas as pd

# sample data
data1 = {'Metric_Month': [pd.Timestamp('2022-05-01 00:00:00'), pd.Timestamp('2022-04-01 00:00:00'),
                          pd.Timestamp('2022-03-01 00:00:00'), pd.Timestamp('2022-02-01 00:00:00'),
                          pd.Timestamp('2022-01-01 00:00:00'), pd.Timestamp('2021-12-01 00:00:00'),
                          pd.Timestamp('2021-11-01 00:00:00'), pd.Timestamp('2021-10-01 00:00:00'),
                          pd.Timestamp('2021-09-01 00:00:00')],
         'Metric_Value': [0.2876, 0.3002, 0.3169, 0.3137, 0.3022, 0.3071, 0.3122, 0.307, 0.3091]}
df1 = pd.DataFrame(data1)

data2 = {'start_date': [pd.Timestamp('2021-11-01 00:00:00'), pd.Timestamp('2021-10-01 00:00:00')],
         'end_date': [pd.Timestamp('2022-05-01 00:00:00'), pd.Timestamp('2022-04-01 00:00:00')],
         'lower_bound': [0.2954, 0.2991], 'upper_bound': [0.3221, 0.3206]}
df2 = pd.DataFrame(data2)

Metric data frame:

Metric Month    Metric Value
2022-05-01  0.2876
2022-04-01  0.3002
2022-03-01  0.3169
2022-02-01  0.3137
2022-01-01  0.3022
2021-12-01  0.3071
2021-11-01  0.3122
2021-10-01  0.3070
2021-09-01  0.3091

Bounds data frame:

    start_date end_date    lower_bound upper_bound
   2021-11-01  2022-05-01  0.2954      0.3221
   2021-10-01  2022-04-01  0.2991      0.3206

My goal is to plot these (different but overlapping) lower and upper bounds on the same scatter plot of metric values.

My idea is to expand the data frame so that new dates are added in rows between the start and the end (included) dates as I want to do fill_between plots on this data set.

In our example a new data frame would look like this (end_date column can be removed):

start_date  end_date    lower_bound upper_bound
2021-11-01  2022-05-01  0.2954      0.3221
2021-12-01  2022-05-01  0.2954      0.3221
2022-01-01  2022-05-01  0.2954      0.3221
...
2022-05-01  2022-05-01  0.2954      0.3221
2021-10-01  2022-04-01  0.2991      0.3206
2021-11-01  2022-04-01  0.2991      0.3206
2021-12-01  2022-04-01  0.2991      0.3206
...
2022-04-01  2022-04-01  0.2991      0.3206

The final result should be a plot like this:

enter image description here

How to do this, and do I need to extend the bounds df in such a way?

Trenton McKinney
  • 56,955
  • 33
  • 144
  • 158
Lior
  • 147
  • 1
  • 10

1 Answers1

1
import matplotlib.pyplot as plt

# add a color column to df2
df2['color'] = ['yellow', 'green']

# plot df1
ax = df1.plot(x='Metric_Month', marker='o', figsize=(10, 8))

# iterate through each row of df2 to add horizontal lines and fill between
for i, d in df2.iterrows():
    ax.hlines(y=[d.lower_bound, d.upper_bound], xmin=d.start_date, xmax=d.end_date, colors=d.color, label=f'Index: {i}')
    ax.fill_between(x=[d.start_date, d.end_date], y1=d.lower_bound, y2=d.upper_bound, color=d.color, alpha=0.2)
ax.legend(loc='lower left')
plt.show()

enter image description here

Trenton McKinney
  • 56,955
  • 33
  • 144
  • 158