I am struggling with trying to split my dataframe into 3 new dataframes where the split occurs when the supplier name changes. I have searched for existing questions. How to split dataframe on based on columns row and Pandas & python: split dataframe into many dataframes based on column value containing substring came close but I cannot get the output that I want.
Here is attached a toy dataset to illustrate my problem:
df = pd.DataFrame({'Supplier': ['Supplier1', 'Supplier1', 'Supplier2', 'Supplier2', 'Supplier2', 'Supplier3','Supplier3'], 'Class' : ['A', 'A','A','A','A','B','B']})
and I tried (unsuccessfully)
df1 = df.iloc[:df.index[df['Supplier'] == 'Supplier1'].tolist()[0]]
df2 = df.iloc[df.index[df['Supplier'] == 'Supplier2'].tolist()[0]+1:]
df3 = df.iloc[df.index[df['Supplier'] == 'Supplier3'].tolist()[0]+1:]
The result that I am trying to achieve is:
Supplier Class
0 Supplier1 A
1 Supplier1 A
Supplier Class
0 Supplier2 A
1 Supplier2 A
2 Supplier2 A
Supplier Class
0 Supplier3 B
1 Supplier3 B
Any help on this would be greatly appreciated. Thank you!
UPDATE: Using:
df1 = {i:group for i,group in df.groupby( df['Supplier'].ne(df['Supplier'].shift()).cumsum() )}
gives:
{1: Supplier Class
0 Supplier1 A
1 Supplier1 A, 2: Supplier Class
2 Supplier2 A
3 Supplier2 A
4 Supplier2 A, 3: Supplier Class
5 Supplier3 B
6 Supplier3 B}
I need to split into separated dataframe so I did:
df3 = pd.DataFrame.from_dict({i:group for i,group in df1.groupby(df1['Supplier'].ne(df1['Supplier'].shift()).cumsum() )},orient='index', columns= ['Class'])
but it gives the error
df3 = pd.DataFrame.from_dict({i:group for i,group in df1.groupby(df1['Supplier'].ne(df1['Supplier'].shift()).cumsum() )},orient='index', columns= ['Class'])
AttributeError: 'dict' object has no attribute 'groupby'