I am working with a pandas DataFrame (df) that contains the sample data below:

0    Dec-16     N
1    Jan-17     N
2    Feb-17     Y
3    Feb-17     N
4    Jan-17     N
5    Mar-17     Y
6    Mar-17     Y
7    Jan-17     N
8    Jan-17     Y

Using

df_group = df.groupby(['MMM-YY', 'Valid']).size()

I get the output below:

MMM-YY  Valid
Dec-16      N      1
Feb-17      N      1
            Y      1
Jan-17      N      3
            Y      1
Mar-17      Y      2

I want to create a bar chart from this data that shows the percentage of Y and N for each month, but I have not been able to. I tried converting the above output to a new DataFrame, but with no luck.

Any pointers for resolving this would be really appreciated.

panr
  • What's wrong with that output? I know it's a simplistic question, but we need to know what's wrong in order to give you a clear answer. What error are you getting, or how does this deviate from your expectations? Look at [ask]. – Arya McCarthy May 24 '17 at 09:54
  • I want to convert the output of groupby to a DataFrame so that I can generate the bar chart. That's what I think would work for me. If there is an alternative approach that generates the bar chart directly, that would also suffice. – panr May 24 '17 at 09:59
  • @aryamccarthy Thanks for the link, my requirement is quite similar to what you have shared. Let me work it out. Unfortunately I was not able to find this. – panr May 24 '17 at 10:12

1 Answer

I think you need crosstab with normalization over each row, followed by DataFrame.plot.bar:

import pandas as pd

# normalize=0 divides each row by its row total; df itself is left unchanged
df_group = pd.crosstab(df['MMM-YY'], df['Valid'], normalize=0)
print(df_group)
Valid      N     Y
MMM-YY            
Dec-16  1.00  0.00
Feb-17  0.50  0.50
Jan-17  0.75  0.25
Mar-17  0.00  1.00

df_group.plot.bar()

graph
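
Since the question asks for the bars in percent, the row-normalized fractions above can be scaled by 100 before plotting. The following is only a sketch under a few assumptions: the column names 'MMM-YY' and 'Valid' are taken from the groupby call in the question, the sample frame is rebuilt by hand, and plot_percent is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

# Sample data rebuilt from the question (column names assumed from the groupby call).
df = pd.DataFrame({
    "MMM-YY": ["Dec-16", "Jan-17", "Feb-17", "Feb-17", "Jan-17",
               "Mar-17", "Mar-17", "Jan-17", "Jan-17"],
    "Valid": ["N", "N", "Y", "N", "N", "Y", "Y", "N", "Y"],
})

# Row-normalized shares, scaled from fractions to percentages.
pct = pd.crosstab(df["MMM-YY"], df["Valid"], normalize="index") * 100


def plot_percent(table):
    """Hypothetical helper: grouped bars on a 0-100 % axis (needs matplotlib)."""
    ax = table.plot.bar(rot=0)
    ax.set_ylabel("Share of rows (%)")
    ax.set_ylim(0, 100)
    return ax
```

Calling plot_percent(pct) should then draw one N bar and one Y bar per month, with each month's pair adding up to 100.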

If you need to normalize per column:

# normalize=1 divides each column by its column total
df_group1 = pd.crosstab(df['MMM-YY'], df['Valid'], normalize=1)
print(df_group1)
Valid     N     Y
MMM-YY           
Dec-16  0.2  0.00
Feb-17  0.2  0.25
Jan-17  0.6  0.25
Mar-17  0.0  0.50

df_group1.plot.bar()

graph

If you need the counts only:

df1 = pd.crosstab(df['MMM-YY'], df['Valid'])
print(df1)
Valid   N  Y
MMM-YY      
Dec-16  1  0
Feb-17  1  1
Jan-17  3  1
Mar-17  0  2

df1.plot.bar()

graph
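
For readers who prefer to start from the groupby in the question, the grouped counts can also be reshaped into the same kind of table with size and unstack. This is a sketch of an alternative route, not something the answer above relies on; it assumes the same column names and rebuilds the sample frame by hand.

```python
import pandas as pd

# Sample data rebuilt from the question (column names assumed from the groupby call).
df = pd.DataFrame({
    "MMM-YY": ["Dec-16", "Jan-17", "Feb-17", "Feb-17", "Jan-17",
               "Mar-17", "Mar-17", "Jan-17", "Jan-17"],
    "Valid": ["N", "N", "Y", "N", "N", "Y", "Y", "N", "Y"],
})

# size() counts the rows per (month, flag) pair; unstack() moves the 'Valid'
# level into columns, and fill_value=0 covers combinations that never occur.
counts = df.groupby(["MMM-YY", "Valid"]).size().unstack(fill_value=0)

# Divide each row by its row total to get per-month shares.
shares = counts.div(counts.sum(axis=1), axis=0)
```

Here counts should match the plain crosstab and shares the row-normalized one, so either can be passed to .plot.bar() in the same way.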

jezrael