
I have been trying to understand the answer to this post in order to create two different legends.

I create a clustered stacked bar plot with a different hatch for each DataFrame's bars, and my code below differs slightly from the answer in the aforementioned post.

But I have not been able to figure out how to get one legend with the colors and one legend with the hatches.

The color legend should correspond to A, B, C, D, E and the hatch legend should indicate "with" if the bar is hatched and "without" if it is not hatched.

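import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.colors import LinearSegmentedColormap as coloring

# the two example DataFrames (identical columns and index)
df_1 = pd.DataFrame({"A": [10, 30], "B": [15, 33], "C": [23, 0], "D": [25, 20], "E": [27, 17]},
                    index=["Mg", "Ca"])
df_2 = pd.DataFrame({"A": [20, 7], "B": [12, 26], "C": [8, 12], "D": [40, 22], "E": [10, 16]},
                    index=["Mg", "Ca"])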

hatches=(' ', '//')
colors_ABCDE=['tomato', 'gold', 'greenyellow', 'forestgreen', 'palevioletred']
dfs=[df_1,df_2]

for each_df, df in enumerate(dfs):
    df.plot(ax=plt.subplot(111), kind="barh", \
            stacked=True, hatch=hatches[each_df], \
            colormap=coloring.from_list("my_colormap", colors_ABCDE), \
            figsize=(7,2.5), position=len(dfs)-each_df-1, \
            align='center', width=0.2, edgecolor="darkgrey")

plt.legend(loc='center left', bbox_to_anchor=(1.0, 0.5), fontsize=12)

The plot I managed to get is:

plot

Any ideas how to create two legends and place them one next to the other or one below the other? Thanks in advance ^_^

Newbielp

2 Answers


Because building legends by hand in matplotlib takes quite a few steps, consider using the function from the very post you cite (the solution by @jrjc). You will, however, need to adapt that function to a horizontal bar chart. Specifically:

  • Add an argument for the color map and pass it in the DataFrame.plot call
  • Change the bar plot from kind='bar' to kind='barh' for the horizontal version
  • Swap x for y in the line: rect.set_y(rect.get_y() + 1 / float(n_df + 1) * i / float(n_col))
  • Swap width for height in the line: rect.set_height(1 / float(n_df + 1)) (the function below uses n_df + 2 in both of these lines, which gives thinner bars)
  • Adjust axe.set_xticks and axe.set_xticklabels to use np.arange(0, 125, 20) values
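The arithmetic behind the two rect lines can be looked at separately from matplotlib. The following is only a sketch: cluster_layout and gap_slots are hypothetical names that do not appear in the linked function, and gap_slots=1 is assumed to correspond to the n_df + 1 divisor quoted above.

```python
def cluster_layout(n_df, gap_slots=1):
    """Return (bar_height, offsets) for one cluster of n_df stacked bars.

    Hypothetical helper, not part of the linked function. Each cluster is
    treated as a unit-length slot that is split into n_df + gap_slots equal
    parts. The k-th DataFrame's bar is shifted by k parts along y, and the
    remaining gap_slots parts stay empty to separate neighbouring clusters.
    """
    bar_height = 1 / float(n_df + gap_slots)
    offsets = [bar_height * k for k in range(n_df)]
    return bar_height, offsets


# Two DataFrames with the divisor n_df + 1: thirds of a slot.
height_a, offsets_a = cluster_layout(2)

# Two DataFrames with the divisor n_df + 2: quarters of a slot, so thinner bars.
height_b, offsets_b = cluster_layout(2, gap_slots=2)  # 0.25 and [0.0, 0.25]
```

In the function itself, i runs over 0, n_col, 2 * n_col, ..., so i / float(n_col) is simply the DataFrame index k used in the sketch.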

Function

import numpy as np
import pandas as pd
import matplotlib.cm as cm
import matplotlib.pyplot as plt
from matplotlib.colors import LinearSegmentedColormap as coloring

def plot_clustered_stacked(dfall, labels=None, title="multiple stacked bar plot", H="//",
                            colors_ABCDE=['tomato', 'gold', 'greenyellow', 'forestgreen', 'palevioletred'], **kwargs):
    """
       CREDIT: @jrjc (https://stackoverflow.com/a/22845857/1422451)

       Given a list of dataframes, with identical columns and index, create a clustered stacked bar plot. 
       labels is a list of the names of the dataframe, used for the legend
       title is a string for the title of the plot
       H is the hatch used for identification of the different dataframe
    """

    n_df = len(dfall)
    n_col = len(dfall[0].columns) 
    n_ind = len(dfall[0].index)
    axe = plt.subplot(111)

    for df in dfall : # for each data frame
        axe = df.plot(kind="barh",
                      linewidth=0,
                      stacked=True,
                      ax=axe,
                      legend=False,
                      grid=False,
                      colormap=coloring.from_list("my_colormap", colors_ABCDE),
                      edgecolor="darkgrey",
                      **kwargs)  # make bar plots

    h,l = axe.get_legend_handles_labels() # get the handles we want to modify
    for i in range(0, n_df * n_col, n_col): # len(h) = n_col * n_df
        for j, pa in enumerate(h[i:i+n_col]):
            for rect in pa.patches: # for each index
                rect.set_y(rect.get_y() + 1 / float(n_df + 2) * i / float(n_col))
                rect.set_hatch(H * int(i / n_col)) #edited part     
                rect.set_height(1 / float(n_df + 2))

    axe.set_xticks(np.arange(0, 125, 20))
    axe.set_xticklabels(np.arange(0, 125, 20).tolist(), rotation = 0)
    axe.margins(x=0, tight=None)
    axe.set_title(title)

    # Add invisible data to add another legend
    n=[]        
    for i in range(n_df):
        n.append(axe.bar(0, 0, color="gray", hatch=H * i, edgecolor="darkgrey"))

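    l1 = axe.legend(h[:n_col], l[:n_col], loc=[1.01, 0.5])  # color legend
    if labels is not None:
        # a second legend call replaces l1, so l1 is re-added as a plain artist afterwards
        l2 = axe.legend(n, labels, loc=[1.01, 0.1])  # hatch legend
        axe.add_artist(l1)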
    return axe

Call

plt.figure(figsize=(10, 4))
plot_clustered_stacked([df_1, df_2],["df_1", "df_2"])
plt.show()

plt.clf()
plt.close()

Output

Plot Output

Parfait
  • I preferred not to use this code because I find it unnecessarily complex and I tend to lose track when `rect` is used. Thus, I stuck with my own piece and just used the bits and pieces that add the second legend. – Newbielp May 04 '20 at 15:43
  • Makes sense. The linked function is a generalized solution that needs tailored adjustments for specific end users like yourself. But glad you found a solution. Happy coding! – Parfait May 04 '20 at 17:04

I found the function solution by @jrjc rather hard to follow, so I preferred to adjust my own code a little instead.

It took me some time to understand that when a second legend is created on the same plot, matplotlib replaces the first one, and this is when add_artist() must be employed.

The other prerequisite for adding the second legend is to keep a reference to the plot (the Axes object that df.plot returns) and call the .add_artist() method on that specific object, so that matplotlib knows where to attach the first legend again.
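The mechanism described above can be checked in isolation. This is a minimal sketch on an empty Axes, assuming proxy Patch handles with made-up labels; it is not the plot from the question.

```python
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches

fig, ax = plt.subplots()

# Proxy handles: patches that are never drawn on the plot itself,
# they only serve as legend entries.
color_handles = [mpatches.Patch(facecolor=c, label=name)
                 for c, name in [("tomato", "A"), ("gold", "B")]]
hatch_handles = [mpatches.Patch(facecolor="white", edgecolor="darkgrey",
                                hatch=h, label=name)
                 for h, name in [("", "without"), ("//", "with")]]

# First legend: becomes the legend of the Axes.
legend_1 = ax.legend(handles=color_handles, loc="upper left")

# Second legend: takes over as the legend of the Axes, so legend_1
# would no longer be drawn.
legend_2 = ax.legend(handles=hatch_handles, loc="upper right")
first_was_replaced = ax.get_legend() is legend_2

# Re-attach the first legend as an ordinary artist so that both are drawn.
ax.add_artist(legend_1)
both_attached = (legend_1 in ax.get_children()
                 and legend_2 in ax.get_children())
```

Only the legend that was replaced needs add_artist(); the most recent one is already attached to the Axes.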

In short, this is how I managed to create the plot I had in mind; I hope the comments make it clearer and useful for anyone.

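import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.colors import LinearSegmentedColormap as coloring
import matplotlib.patches as mpatches

# the two example DataFrames (identical columns and index)
df_1 = pd.DataFrame({"A": [10, 30], "B": [15, 33], "C": [23, 0], "D": [25, 20], "E": [27, 17]},
                    index=["Mg", "Ca"])
df_2 = pd.DataFrame({"A": [20, 7], "B": [12, 26], "C": [8, 12], "D": [40, 22], "E": [10, 16]},
                    index=["Mg", "Ca"])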

hatches=(' ', '//')
colors_ABCDE=['tomato', 'gold', 'greenyellow', 'forestgreen', 'palevioletred']
dfs=[df_1,df_2]
for each_df, df in enumerate(dfs):
    # keep the Axes returned by df.plot; it is named "figure" here, although it is an Axes and not a Figure
    figure=df.plot(ax=plt.subplot(111), kind="barh", \
            stacked=True, hatch=hatches[each_df], \
            colormap=coloring.from_list("my_colormap", colors_ABCDE), \
            figsize=(7,2.5), position=len(dfs)-each_df-1, \
            align='center', width=0.2, edgecolor="darkgrey", \
            legend=False) # switch off the automatic legend; both legends are built by hand below
legend_1=plt.legend(df_1.columns, loc='center left', bbox_to_anchor=(1.0, 0.5), fontsize=12)

patch_hatched = mpatches.Patch(facecolor='beige', hatch='///', edgecolor="darkgrey", label='hatched')
patch_unhatched = mpatches.Patch(facecolor='beige', hatch=' ', edgecolor="darkgrey", label='non-hatched')
legend_2=plt.legend(handles=[patch_hatched, patch_unhatched], loc='center left', bbox_to_anchor=(1.15, 0.5), fontsize=12)

# as soon as a second legend is made, it replaces the first one, which needs to be added back again
figure.add_artist(legend_1) # re-attach "legend_1" to the Axes so that it is drawn alongside "legend_2"

plot with two legends

I am pretty sure that it can be made even more elegant and automated.
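One possible way to automate it is sketched below, under a few assumptions: add_two_legends is a hypothetical helper name, the legend positions are arbitrary choices, the colors are passed through color= rather than through the colormap used above, and the hatch labels "with" and "without" are taken from the question.

```python
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches

df_1 = pd.DataFrame({"A": [10, 30], "B": [15, 33], "C": [23, 0],
                     "D": [25, 20], "E": [27, 17]}, index=["Mg", "Ca"])
df_2 = pd.DataFrame({"A": [20, 7], "B": [12, 26], "C": [8, 12],
                     "D": [40, 22], "E": [10, 16]}, index=["Mg", "Ca"])

dfs = [df_1, df_2]
hatches = ["", "//"]                # one hatch per DataFrame
hatch_labels = ["without", "with"]  # labels requested in the question
colors = ["tomato", "gold", "greenyellow", "forestgreen", "palevioletred"]


def add_two_legends(ax, colors, color_labels, hatches, hatch_labels):
    """Hypothetical helper: draw a color legend and a hatch legend on ax."""
    color_handles = [mpatches.Patch(facecolor=c, edgecolor="darkgrey", label=lab)
                     for c, lab in zip(colors, color_labels)]
    hatch_handles = [mpatches.Patch(facecolor="white", edgecolor="darkgrey",
                                    hatch=h, label=lab)
                     for h, lab in zip(hatches, hatch_labels)]
    # Placement is an assumption: both legends sit to the right of the Axes.
    first = ax.legend(handles=color_handles, loc="upper left",
                      bbox_to_anchor=(1.0, 1.0))
    second = ax.legend(handles=hatch_handles, loc="upper left",
                       bbox_to_anchor=(1.2, 1.0))
    ax.add_artist(first)  # the second call replaced the first legend
    return first, second


fig, ax = plt.subplots(figsize=(7, 2.5))
for i, (df, hatch) in enumerate(zip(dfs, hatches)):
    df.plot(ax=ax, kind="barh", stacked=True, hatch=hatch, color=colors,
            position=len(dfs) - i - 1, width=0.2, edgecolor="darkgrey",
            legend=False)

color_legend, hatch_legend = add_two_legends(
    ax, colors, list(df_1.columns), hatches, hatch_labels)
```

Because both legends are built from proxy patches, the helper does not depend on the order in which the bars were drawn. When saving, fig.savefig(..., bbox_inches="tight") usually helps to keep legends placed outside the Axes inside the image.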

Newbielp