pandas.DataFrame.groupby
the desired features and aggregate the sum
.unstack
the grouped DataFrame
from a long, to a wide format.
- Remove level 0 of the multi-level column names created when unstacking
- Plot the stacked bar, with
pandas.DataFrame.plot
pandas.DataFrame.plot.bar
- If you have a lot of
'Feature'
, use .barh
for a horizontal bar, adjust figsize
as needed, and change plt.xlabel('Rework')
to ylabel
.
import pandas as pd
import matplotlib.pyplot as plt
# read data
df = pd.read_csv('test.csv')
# display(df)
Part_Number Serial_Number Timestamp Feature Machine Tool Rework
0 PN1 100 9/9/2020 8:26 FEAT_FN_H30 H10 93.0 1
1 PN1 101 9/9/2020 9:05 FEAT1_FN_H12 G3 85.0 2
2 PN1 101 9/9/2020 9:05 FEAT1_FN_H12 G3 85.0 3
3 PN1 101 9/9/2020 9:05 FEAT1_FN_H12 G4 85.0 2
4 PN1 102 9/9/2020 9:29 FEAT_FN_H23 H4 81.0 1
5 PN1 103 9/9/2020 10:53 FEAT1_FN_H15 H7 24.0 3
6 PN1 104 9/9/2020 10:53 FEAT1_FN_H17 H7 24.0 3
7 PN1 105 9/9/2020 10:53 FEAT1_FN_H19 NaN 24.0 1
8 PN1 106 9/9/2020 12:03 FEAT1_LF_LOC_BE F7 6.0 2
9 PN1 107 9/9/2020 12:03 FEAT1_LF_LOC_FN NaN NaN 12
10 PN1 108 9/9/2020 12:33 FEAT_FN_H42 NaN 93.0 2
11 PN1 109 9/9/2020 12:33 FEAT_FN_H43 H10 83.0 1
12 PN1 110 9/9/2020 12:33 FEAT_FN_H44 H10 83.0 1
13 PN1 111 9/9/2020 12:33 FEAT_FN_H45 H10 83.0 1
14 PN1 112 9/9/2020 12:33 FEAT_FN_H46 H10 83.0 1
# groupby and sum
dfg = df.groupby(['Feature', 'Machine']).agg({'Rework': sum}).unstack()
# fix column names
dfg.columns = dfg.columns.droplevel(0)
# sort the index by the sum of the rows
dfg = dfg.reindex(dfg.sum(axis=1).sort_values(ascending=False).index)
# plot the top 5
dfg.head().plot.bar(stacked=True, figsize=(8, 5), grid=True)
plt.ylabel('Rework')
plt.legend(title=dfg.columns.name, bbox_to_anchor=(1.05, 1), loc='upper left')

Test data
- Extra data was added for testing, because there where no examples with a
'Feature'
being produced by more than one 'Machine'
.
Part_Number,Serial_Number,Timestamp,Feature,Machine,Tool,Rework
PN1,100,9/9/2020 8:26,FEAT_FN_H30,H10,93,1
PN1,101,9/9/2020 9:05,FEAT1_FN_H12,G3,85,2
PN1,101,9/9/2020 9:05,FEAT1_FN_H12,G3,85,3
PN1,101,9/9/2020 9:05,FEAT1_FN_H12,G4,85,2
PN1,102,9/9/2020 9:29,FEAT_FN_H23,H4,81,1
PN1,103,9/9/2020 10:53,FEAT1_FN_H15,H7,24,3
PN1,104,9/9/2020 10:53,FEAT1_FN_H17,H7,24,3
PN1,105,9/9/2020 10:53,FEAT1_FN_H19,,24,1
PN1,106,9/9/2020 12:03,FEAT1_LF_LOC_BE,F7,6,2
PN1,107,9/9/2020 12:03,FEAT1_LF_LOC_FN,,,12
PN1,108,9/9/2020 12:33,FEAT_FN_H42,,93,2
PN1,109,9/9/2020 12:33,FEAT_FN_H43,H10,83,1
PN1,110,9/9/2020 12:33,FEAT_FN_H44,H10,83,1
PN1,111,9/9/2020 12:33,FEAT_FN_H45,H10,83,1
PN1,112,9/9/2020 12:33,FEAT_FN_H46,H10,83,1