I would like to plot a bar graph, using pandas, from a dataframe that has two categorical variables and five numeric columns. I would like to first group by one categorical variable and show the sums of the numeric columns as grouped bars. I would then like to split each bar by the second categorical variable, so that every bar is stacked by that second category.
A sample dataframe like mine can be constructed as follows:
import random
import pandas as pd

n = 100  # number of rows
# five binary numeric columns plus two categorical columns
df = pd.DataFrame({'op1': [random.randint(0, 1) for _ in range(n)],
                   'op2': [random.randint(0, 1) for _ in range(n)],
                   'op3': [random.randint(0, 1) for _ in range(n)],
                   'op4': [random.randint(0, 1) for _ in range(n)],
                   'op5': [random.randint(0, 1) for _ in range(n)],
                   'cat': random.choices(list('abcde'), k=n),
                   'gender': random.choices(list('mf-'), k=n)})
df.head(10)
  cat gender  op1  op2  op3  op4  op5
0   d      m    1    1    1    1    1
1   a      m    1    1    0    0    1
2   b      -    1    0    1    0    1
3   c      m    0    1    0    0    0
4   b      -    0    0    1    1    0
5   c      f    1    1    1    1    1
6   a      -    1    1    0    1    0
7   d      f    1    0    1    0    1
8   d      m    1    1    0    1    0
9   b      -    1    0    1    0    0
I can produce the grouped bar chart easily enough:
df.groupby('cat')[['op%s' % i for i in range(1, 6)]].sum().plot.bar()
But how can I get each bar to show the gender breakdown?
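One possible direction, offered as a rough sketch rather than a definitive answer: sum by both `cat` and `gender`, then draw the bars with matplotlib directly, using the `bottom` argument of `Axes.bar` to stack the gender segments inside each `op` bar. The seed, the colour choices, the `Agg` backend and the helper names (`sums`, `table`, `width`) are assumptions made for illustration and are not part of the original setup.

```python
import random

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

random.seed(0)  # assumption: fixed seed so the sample data is reproducible
n = 100
ops = ['op%s' % i for i in range(1, 6)]
data = {op: [random.randint(0, 1) for _ in range(n)] for op in ops}
data['cat'] = random.choices(list('abcde'), k=n)
data['gender'] = random.choices(list('mf-'), k=n)
df = pd.DataFrame(data)

cats = sorted(df['cat'].unique())
genders = sorted(df['gender'].unique())

# Sum every op column per (cat, gender) pair, filling in pairs that never occur.
full_index = pd.MultiIndex.from_product([cats, genders], names=['cat', 'gender'])
sums = df.groupby(['cat', 'gender'])[ops].sum().reindex(full_index, fill_value=0)

fig, ax = plt.subplots()
x = np.arange(len(cats))          # one slot per category on the x axis
width = 0.8 / len(ops)            # each op gets a slice of that slot
colors = dict(zip(genders, ['tab:gray', 'tab:orange', 'tab:blue']))

for i, op in enumerate(ops):
    table = sums[op].unstack('gender')   # rows: cat, columns: gender
    bottom = np.zeros(len(cats))
    for g in genders:
        heights = table[g].to_numpy()
        # Stack this gender's segment on top of the segments drawn so far.
        ax.bar(x + i * width, heights, width, bottom=bottom,
               color=colors[g], edgecolor='white',
               label=g if i == 0 else '_nolegend_')  # one legend entry per gender
        bottom += heights

# Centre the tick labels under each group of op bars.
ax.set_xticks(x + width * (len(ops) - 1) / 2)
ax.set_xticklabels(cats)
ax.set_xlabel('cat')
ax.legend(title='gender')
fig.savefig('grouped_stacked.png')
```

In this sketch colour encodes gender, so the five bars within each category are told apart only by position (op1 to op5, left to right). Annotating the bars or adding a hatch pattern per op would be one way to make that explicit.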