I have multiple bar charts, each created from a different column of a pandas DataFrame.

import matplotlib.pyplot as plt
import numpy as np
import seaborn

fig1 = plt.figure()
ypos = np.arange(len(dframe))

colorscheme = seaborn.color_palette(n_colors=4)

accuracyFig = fig1.add_subplot(221)
accuracyFig.bar(ypos,dframe['accuracy'], align = 'center', color=colorscheme)
accuracyFig.set_xticks([0,1,2,3])
accuracyFig.set_ylim([0.5,1])

sensitivityFig = fig1.add_subplot(222)
sensitivityFig.bar(ypos, dframe['sensitivity'], align = 'center',color=colorscheme )
sensitivityFig.set_xticks([0,1,2,3])
sensitivityFig.set_ylim([0.5,1])

specificityFig = fig1.add_subplot(223)
specificityFig.bar(ypos, dframe['specificity'], align = 'center', color=colorscheme)
specificityFig.set_xticks([0,1,2,3])
specificityFig.set_ylim([0.5,1])

precisionFig = fig1.add_subplot(224)
precisionFig.bar(ypos, dframe['precision'], align = 'center', color=colorscheme)
precisionFig.set_xticks([0,1,2,3])
precisionFig.set_ylim([0.5,1])

where dframe is a pandas DataFrame with numeric values. This produces a 2x2 grid of bar charts.

Each color corresponds to one of the classifier models (perceptron, C2, C3 and C4), whose names are stored in dframe['name'].

Now I want to plot a single legend for the whole figure. I tried the following:

leg = plt.legend(dframe['name'])

But it does not produce the legend I want.

How can I plot a single legend and place it below the figure in 2 columns?

This is my dataframe:

                     name        accuracy     sensitivity     specificity       precision
0              perceptron  0.820182164169  0.852518881235  0.755172413793  0.875007098643
1  DecisionTreeClassifier             1.0             1.0             1.0             1.0
2    ExtraTreesClassifier             1.0             1.0             1.0             1.0
3  RandomForestClassifier  0.999796774253  0.999889340748  0.999610678532  0.999806362379
jrjc
Raja Sattiraju

3 Answers

Well, first, your table is not in a tidy format (see here: http://vita.had.co.nz/papers/tidy-data.pdf).

Having your table in a tidy (or long) format has the huge advantage that plotting becomes really easy with seaborn (among other advantages):

df # yours
                     name        accuracy     sensitivity     specificity       precision
0              perceptron  0.820182164169  0.852518881235  0.755172413793  0.875007098643
1  DecisionTreeClassifier             1.0             1.0             1.0             1.0
2    ExtraTreesClassifier             1.0             1.0             1.0             1.0
3  RandomForestClassifier  0.999796774253  0.999889340748  0.999610678532  0.999806362379

Convert it into a long format (or tidy):

df2 = pd.melt(df, value_vars=["accuracy", "sensitivity", "specificity", "precision"], id_vars="name")
df2
                      name     variable     value
0               perceptron     accuracy  0.820182
1   DecisionTreeClassifier     accuracy  1.000000
2     ExtraTreesClassifier     accuracy  1.000000
3   RandomForestClassifier     accuracy  0.999797
4               perceptron  sensitivity  0.852519
5   DecisionTreeClassifier  sensitivity  1.000000
6     ExtraTreesClassifier  sensitivity  1.000000
7   RandomForestClassifier  sensitivity  0.999889
8               perceptron  specificity  0.755172
9   DecisionTreeClassifier  specificity  1.000000
10    ExtraTreesClassifier  specificity  1.000000
11  RandomForestClassifier  specificity  0.999611
12              perceptron    precision  0.875007
13  DecisionTreeClassifier    precision  1.000000
14    ExtraTreesClassifier    precision  1.000000
15  RandomForestClassifier    precision  0.999806
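To make the reshaping step easy to reproduce, here is a minimal self-contained sketch. It rebuilds the table from the question by hand (so the literal values are copied from the post, not read from your real data) and checks that melting loses nothing by pivoting back. The name wide_again is only illustrative.

```python
import pandas as pd

# Wide table from the question: one row per classifier, one column per metric.
df = pd.DataFrame({
    "name": ["perceptron", "DecisionTreeClassifier",
             "ExtraTreesClassifier", "RandomForestClassifier"],
    "accuracy": [0.820182164169, 1.0, 1.0, 0.999796774253],
    "sensitivity": [0.852518881235, 1.0, 1.0, 0.999889340748],
    "specificity": [0.755172413793, 1.0, 1.0, 0.999610678532],
    "precision": [0.875007098643, 1.0, 1.0, 0.999806362379],
})

metrics = ["accuracy", "sensitivity", "specificity", "precision"]

# Long (tidy) table: one row per (classifier, metric) pair.
df2 = pd.melt(df, id_vars="name", value_vars=metrics)

# Pivoting the long table recovers the wide layout, indexed by classifier name.
wide_again = df2.pivot(index="name", columns="variable", values="value")
```

Four classifiers times four metrics gives 16 rows in df2, with the columns name, variable and value.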

Then plot what you want in one call, plus two lines to make it cleaner:

import matplotlib.pyplot as plt
import seaborn as sns

# sns.catplot was called sns.factorplot before seaborn 0.9
g = sns.catplot(data=df2,
                kind="bar",
                col="variable", # one plot per variable: 1 row of 4 columns (4 different variables)
                x="name", # in each plot the x-axis will be the name
                y="value", # the height of the bar
                col_wrap=2) # wrap the row so that each row holds at most 2 plots
g.set_xticklabels(rotation=90) # rotate the labels so they don't overlap
plt.tight_layout() # fit everything into the figure

multiple barplot

HTH

jrjc

You can use the following to move your legend to where you need it in your figure.

Adding a label to each bar when you plot it is necessary, because the legend is built from those labels. I have also changed the main lines where you plot your legend.

I have used some dummy labels here. In your code you would take the labels from your dataframe, for example labels = dframe['name'].tolist() for the classifier names (labels = list(df) would give you the column names instead).

import matplotlib.pyplot as plt

colorscheme = ['r','b','c','y']
fig1 = plt.figure()
accuracyFig = fig1.add_subplot(221)
A =[1,2,3,4]
B = [4,3,2,1]
labels = ['perceptron','C2','C3','C4']
for i in range(0,len(A)):
    accuracyFig.bar(A[i],B[i], align = 'center',label = labels[i], color = colorscheme[i])

accuracyFig1 = fig1.add_subplot(223)
A =[1,2,3,4]
B = [4,3,2,1]
labels = ['perceptron','C2','C3','C4']
for i in range(0,len(A)):
    accuracyFig1.bar(A[i],B[i], align = 'center',label = labels[i], color = colorscheme[i])

accuracyFig2 = fig1.add_subplot(222)
A =[1,2,3,4]
B = [4,3,2,1]
labels = ['perceptron','C2','C3','C4']
for i in range(0,len(A)):
    accuracyFig2.bar(A[i],B[i], align = 'center',label = labels[i], color = colorscheme[i])

accuracyFig3 = fig1.add_subplot(224)
A =[1,2,3,4]
B = [4,3,2,1]
labels = ['perceptron','C2','C3','C4']
for i in range(0,len(A)):
    accuracyFig3.bar(A[i],B[i], align = 'center',label = labels[i], color = colorscheme[i])

# Plot the legend:
# Anchor it to the figure as a whole rather than to any particular axes.

plt.legend(loc = 'lower center',bbox_to_anchor = (0,-0.3,1,1),
        bbox_transform = plt.gcf().transFigure)
plt.show()

Sources for legend plot:

How to create custom legend in matplotlib based on the value of the barplot?
How do I make a single legend for many subplots with matplotlib?
How to put the legend out of the plot

Update: I accidentally deleted my comments. Adding ncol = 2 within legend() will give the symmetrical two-column split you want.
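As an alternative to anchoring an axes legend with bbox_transform, matplotlib also lets you attach a legend to the figure itself with fig.legend. The sketch below is not the code from this answer; it reuses the same dummy labels and heights, and the 0.2 bottom margin is just a guess that you may need to tune.

```python
import matplotlib.pyplot as plt

labels = ["perceptron", "C2", "C3", "C4"]  # dummy names, as above
values = [4, 3, 2, 1]                      # dummy bar heights
colors = ["r", "b", "c", "y"]

fig, axes = plt.subplots(2, 2)
for ax in axes.flat:
    # One call draws all four bars; keep the container to reuse its patches.
    bars = ax.bar(range(len(values)), values, align="center", color=colors)
    ax.set_xticks([])

# Reserve space under the subplots, then draw one legend on the figure.
# The bars of the last subplot serve as handles, since every subplot
# uses the same colors.
fig.subplots_adjust(bottom=0.2)
legend = fig.legend(bars.patches, labels, loc="lower center", ncol=2)
```

Call plt.show() or fig.savefig(...) afterwards to see the result. Because the legend belongs to the figure, none of the four axes carries a legend of its own.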

Chuck

I have modified the code as follows:

import matplotlib.pyplot as plt
import seaborn as sns

fig1 = plt.figure()

A = list(range(1,len(dframe)+1))
labels = dframe['name'].tolist()

colorscheme = sns.color_palette(n_colors=len(dframe))


accuracyFig = fig1.add_subplot(221)
for i in range(0,len(A)):
    accuracyFig.bar(A[i],dframe['accuracy'].iloc[i], align = 'center',label = labels[i], color = colorscheme[i])
accuracyFig.set_xticks([])
accuracyFig.set_ylim([0.5,1])
accuracyFig.set_title('Accuracy')

sensitivityFig = fig1.add_subplot(222)
for i in range(0,len(A)):
    sensitivityFig.bar(A[i],dframe['sensitivity'].iloc[i], align = 'center',label = labels[i], color = colorscheme[i])
sensitivityFig.set_xticks([])
sensitivityFig.set_ylim([0.5,1])
sensitivityFig.set_title('Sensitivity')

specificityFig = fig1.add_subplot(223)
for i in range(0,len(A)):
    specificityFig.bar(A[i],dframe['specificity'].iloc[i], align = 'center',label = labels[i], color = colorscheme[i])
specificityFig.set_xticks([])
specificityFig.set_ylim([0.5,1])
specificityFig.set_title('Specificity')

precisionFig = fig1.add_subplot(224)
for i in range(0,len(A)):
    precisionFig.bar(A[i],dframe['precision'].iloc[i], align = 'center',label = labels[i], color = colorscheme[i])
precisionFig.set_xticks([])
precisionFig.set_ylim([0.5,1])
precisionFig.set_title('Precision')

# Plot the legend:

plt.legend(loc = 'lower center',bbox_to_anchor = (0,-0.05,1,2), ncol=2,
        bbox_transform = plt.gcf().transFigure)

plt.show()

Instead of using a fixed list of labels, I copied them directly from the dataframe, and it works.

I also added the parameter ncol=2 to the legend call, so that the legend is laid out in two columns.
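The four nearly identical subplot blocks can also be folded into one loop over the metric columns. This is only a sketch of that idea, not the code I actually ran: the dataframe is rebuilt by hand from the values in the question, the "C0".."C3" colors are matplotlib's default cycle rather than the seaborn palette, and the handles are collected with get_legend_handles_labels.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for dframe, typed in from the table in the question.
dframe = pd.DataFrame({
    "name": ["perceptron", "DecisionTreeClassifier",
             "ExtraTreesClassifier", "RandomForestClassifier"],
    "accuracy": [0.820182164169, 1.0, 1.0, 0.999796774253],
    "sensitivity": [0.852518881235, 1.0, 1.0, 0.999889340748],
    "specificity": [0.755172413793, 1.0, 1.0, 0.999610678532],
    "precision": [0.875007098643, 1.0, 1.0, 0.999806362379],
})
metrics = ["accuracy", "sensitivity", "specificity", "precision"]
colors = ["C{}".format(i) for i in range(len(dframe))]  # default color cycle

fig, axes = plt.subplots(2, 2)
for ax, metric in zip(axes.flat, metrics):
    # One labelled bar per classifier, positioned by row order.
    for pos, (name, value) in enumerate(zip(dframe["name"], dframe[metric])):
        ax.bar(pos, value, align="center", label=name, color=colors[pos])
    ax.set_xticks([])
    ax.set_ylim([0.5, 1])
    ax.set_title(metric.capitalize())

# Every subplot carries the same labels, so take them from the first one.
handles, labels = axes[0, 0].get_legend_handles_labels()
fig.subplots_adjust(bottom=0.2)
fig.legend(handles, labels, loc="lower center", ncol=2)
```

Iterating over the column values with zip avoids indexing by label, so the loop does not depend on whether the dataframe index starts at 0 or 1.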

Thanks @Charles Morris for the help

Raja Sattiraju