Consider the following data frame:

                 A      B
2022-09-28     1.3    0.0
2022-09-29     1.3    0.0
2022-09-30     1.3    0.9
2022-10-01     1.3    0.9
2022-10-02     0.0    0.9
2022-10-03     0.0    0.9
2022-10-04     0.0    0.0
2022-10-05     0.1    0.0
2022-10-06     0.1    0.0
2022-10-07     0.1    0.0

I need a horizontal bar plot with two vertical levels (A and B) and the date on the x-axis. The length of each bar (bar width) should equal the time interval over which the values are nonzero, and the linewidth (bar height) should equal the average of the values in that interval.

For this example there would be two bars at the A level: the first from 2022-09-28 to 2022-10-01 with linewidth 1.3, and the second from 2022-10-05 to 2022-10-07 with linewidth 0.1. At the B level there would be only one bar, from 2022-09-30 to 2022-10-03 with linewidth 0.9.

This is pretty close, but it is a solution for one bar per level only.

1 Answer

You can use Matplotlib's broken_barh function to plot discontinuous ('broken') horizontal bars. The trick is to feed the data in the dataframe to broken_barh correctly: for each part of the broken bar you need to create a pair (x start, x duration).

For example, A has two parts, as you mentioned. The first part is (2022-09-28, 3 days) with linewidth 1.3, and the second part is (2022-10-05, 2 days) with linewidth 0.1, where each duration is the last date of the part minus its first date. We then feed broken_barh the x ranges [(2022-09-28, 3 days), (2022-10-05, 2 days)] and the linewidths [1.3, 0.1].

import matplotlib.pyplot as plt
import pandas as pd


if __name__ == '__main__':

    # Create dataframe    
    df_dates = pd.date_range('2022-09-28', periods=10)
    df = pd.DataFrame({'A': [1.3, 1.3, 1.3, 1.3, 0.0, 0.0, 0.0, 0.1, 0.1, 0.1],
                       'B': [0.0, 0.0, 0.9, 0.9, 0.9, 0.9, 0.0, 0.0, 0.0, 0.0]},
                      index=df_dates)

    # xranges and linewidth we will feed matplotlib later
    a_xranges = []
    a_lw = []

    # Find unique values in A - aka the parts of the A broken bar plot
    # (this assumes each nonzero value occurs in a single contiguous run)
    for lw_value in df['A'].unique().tolist():
        if lw_value != 0.0:  # skip 0.0, we will not plot it
            
            # Find rows where linewidth for the A values is located
            idx = df.index[df['A'] == lw_value].tolist()
            sub_df = df.loc[idx]

            # Find where the bar plot starts for x axis and x duration
            x_range = (sub_df.index[0], sub_df.index[-1] - sub_df.index[0])  # (x start, x duration)

            # Add x range and linewidth values for that part
            a_xranges.append(x_range)
            a_lw.append(lw_value)

a_xranges and a_lw are now in the format broken_barh expects. Matplotlib handles the pandas dates, so you don't have to worry about date formatting.

You can repeat the same for B. To clean up your code, you could also move the loop into a function and call it for each column instead of duplicating it.

    b_xranges = []
    b_lw = []

    for lw_value in df['B'].unique().tolist():
        if lw_value != 0.0:
            idx = df.index[df['B'] == lw_value].tolist()
            sub_df = df.loc[idx]
            x_range = (sub_df.index[0], sub_df.index[-1] - sub_df.index[0])   
            b_xranges.append(x_range)
            b_lw.append(lw_value)

    # Start figure
    fig, ax = plt.subplots(figsize=(12, 5))
    # Plot A bar plot
    # The (10, 9) is for the y axis: (ymin, bar height)
    ax.broken_barh(a_xranges, (10, 9), edgecolor='k', facecolor='white', linewidth=a_lw)
    # Plot B bar plot
    ax.broken_barh(b_xranges, (20, 9), edgecolor='k', facecolor='white', linewidth=b_lw)

    ax.set_ylabel("Level")
    # Passing labels to set_yticks requires Matplotlib 3.5 or newer
    ax.set_yticks([15, 25], labels=['A', 'B'])
    ax.set_xlabel("Date")
    plt.show()

[Image: example output plot]

If you want the bars closer together, narrower, etc., you can play around with the y-values (10, 9) and (20, 9) that I passed to broken_barh. Hope this helps - cheers!
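
As a follow-up to the "make a function" suggestion, here is a minimal sketch of what such a helper might look like. The name nonzero_runs is hypothetical, not part of pandas or Matplotlib. Unlike the loop above, it splits each column into consecutive runs of nonzero values rather than by unique value, so a value that reappears after a gap yields two separate bars, and it uses the mean of each run as the linewidth, which is how I read the "average of the values" in the question.

```python
import pandas as pd


def nonzero_runs(series):
    """Hypothetical helper: return (xranges, linewidths) for one column.

    Each consecutive run of nonzero values becomes one
    (x start, x duration) pair, with the run's mean as its linewidth.
    """
    nonzero = series != 0
    # The run id increases every time the zero/nonzero state flips
    run_id = nonzero.astype(int).diff().ne(0).cumsum()

    xranges, linewidths = [], []
    # Keep only nonzero rows and group them by the run they belong to
    for _, run in series[nonzero].groupby(run_id[nonzero]):
        start, end = run.index[0], run.index[-1]
        xranges.append((start, end - start))  # (x start, x duration)
        linewidths.append(float(run.mean()))
    return xranges, linewidths


# Same data as in the answer above
dates = pd.date_range('2022-09-28', periods=10)
df = pd.DataFrame({'A': [1.3, 1.3, 1.3, 1.3, 0.0, 0.0, 0.0, 0.1, 0.1, 0.1],
                   'B': [0.0, 0.0, 0.9, 0.9, 0.9, 0.9, 0.0, 0.0, 0.0, 0.0]},
                  index=dates)

a_xranges, a_lw = nonzero_runs(df['A'])
b_xranges, b_lw = nonzero_runs(df['B'])
```

The returned lists should be usable in the same ax.broken_barh(...) calls as before, e.g. ax.broken_barh(a_xranges, (10, 9), edgecolor='k', facecolor='white', linewidth=a_lw).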