I am creating a bar plot with groupby that shows the success rate per individual for calendar year 2012. This works well. The x-axis shows the S_L values and the y-axis shows the success rate (%). My dataset has a success column containing 1 or 0.

ax=df[df['CY']==2012].groupby('S_L').success.mean().sort_values(ascending=False).plot(kind='bar',stacked=False)

Instead of labelling each bar with its value, I want to show the numbers behind the mean: the count of rows where success (a flag) equals 1, i.e. the numerator, and the total count for each group, i.e. the denominator. For example, if a bar shows 90%, calculated as 9 successes (numerator) out of 10 rows for that S_L group (denominator), I want to show n=9 and n=10 on that bar.
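For illustration, the two numbers behind each bar could be computed in one pass with a named aggregation. This is only a sketch: the toy DataFrame below is made up and stands in for the real spreadsheet, and the column names `successes`, `total` and `rate` are hypothetical.

```python
import pandas as pd

# Hypothetical toy data standing in for the real spreadsheet
df = pd.DataFrame({
    "CY": [2012] * 6 + [2013],
    "S_L": ["A", "A", "A", "B", "B", "B", "A"],
    "success": [1, 1, 0, 1, 0, 0, 1],
})

# One row per S_L: numerator, denominator and their ratio
stats = (
    df[df["CY"] == 2012]
    .groupby("S_L")["success"]
    .agg(successes="sum", total="count", rate="mean")
    .sort_values("rate", ascending=False)
)
print(stats)
```

Because the 0/1 flag is summed for the numerator and counted for the denominator, `rate` equals `successes / total` for every group, and sorting the whole table keeps the counts aligned with the sorted rates.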

I looked at the post "Add labels to barplots", and that approach works when I display the bar values. However, I don't know how to show the numbers used in the calculation instead, especially since I also sort the values in descending order. Please help.

My code:

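import pandas as pd
from os import path

import matplotlib.pyplot as plt
import matplotlib.ticker as mtick

# Load the spreadsheet (expanduser only matters if the path starts with ~)
fname = path.expanduser(r'Test file.xlsx')
df = pd.read_excel(io=fname, sheet_name='Sheet1')

# Mean of the 0/1 success flag per S_L, highest first, drawn as bars
ax = df.groupby('S_L').success.mean().sort_values(ascending=False).plot(kind='bar', stacked=False)
ax.set_ylabel('Success Rate')
# Format the y ticks as percentages; the plotted values are fractions in [0, 1]
ax.yaxis.set_major_formatter(mtick.PercentFormatter(xmax=1, decimals=2))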

Below is an image of the dataset: [image: dataset]

user728148
  • I think you would need to give up on the idea of putting the whole stack of commands in a single line. Then you can access the previously generated data within the loop you use to annotate each bar. – ImportanceOfBeingErnest Sep 28 '18 at 20:19
  • Thanks. I am not following your comment. Could you please elaborate? – user728148 Sep 28 '18 at 20:46
  • The labels you want to show are presumably somewhere in the dataframe you plot. So you want to have access to the plotted dataframe and use an index obtained from a loop to look up this information. Splitting that command into a data-generation part and a plot-generation part therefore seems necessary. Note that I do not know what your dataframe looks like and I cannot guess from the information provided. If you show a [mcve] and use it to explain which labels you want to appear and where exactly the problem is, I can probably help further. – ImportanceOfBeingErnest Sep 28 '18 at 21:22
  • @ImportanceOfBeingErnest Extremely sorry for the delay. I have edited my response. Could you please provide further guidance on how I can add the numerator and denominator to the label of each bar? – user728148 Oct 03 '18 at 21:25
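
The split suggested in the comments, with data generation first and plotting second, might look roughly like the sketch below. It assumes a per-group summary table already exists. The `stats` frame here is made-up example data, and `successes`, `total` and `rate` are hypothetical column names, not columns from the asker's file.

```python
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
import pandas as pd

# Hypothetical per-group summary (numerator and denominator per S_L)
stats = pd.DataFrame(
    {"successes": [3, 9], "total": [6, 10]},
    index=pd.Index(["B", "A"], name="S_L"),
)
stats["rate"] = stats["successes"] / stats["total"]
stats = stats.sort_values("rate", ascending=False)

# Plot only the rate; the counts stay available in `stats`
fig, ax = plt.subplots()
stats["rate"].plot(kind="bar", ax=ax)
ax.set_ylabel("Success Rate")
ax.yaxis.set_major_formatter(mtick.PercentFormatter(xmax=1))

# Bars are drawn in the row order of `stats`, so zip pairs each bar with its counts
labels = []
for patch, row in zip(ax.patches, stats.itertuples()):
    label = f"n={row.successes} / n={row.total}"
    x_center = patch.get_x() + patch.get_width() / 2
    ax.annotate(label, (x_center, patch.get_height()), ha="center", va="bottom")
    labels.append(label)
```

Sorting the summary table before plotting is what keeps the labels correct: the bars and the rows share one order, so no separate lookup is needed. Call `plt.show()` or `fig.savefig(...)` afterwards to view the result.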

0 Answers