It was a bit tricky to explain the problem. I want to split a cell containing multiple string values delimited by commas into different rows. The df below is a small example but the real dataset contains up to 15 columns and 15 rows and each cell has 5 to 6 non unique string values in it that I need to separate into different rows.
How can we split the original df to the transformed df?
Original df
import pandas as pd
df = pd.DataFrame({"Privileges":['Type1','Type2','Type3'],"Super_Admin":["A1,A2,A3,A4","A1,B1,B2,B3, A4","C1,B2,C2,C3"], "Admin":["A1,A2","A1,B1,B2","B2, C1,C2"])
Index | Privileges | Super_Admin | Admin |
---|---|---|---|
0 | Type1 | A1,A2,A3,A4 | A1,A2 |
1 | Type2 | A1,B1,B2,B3, A4 | A1,B1,B2 |
2 | Type3 | C1,B2,C2,C3 | B2, C1,C2 |
Transformed df
df = pd.DataFrame({"Privileges":['Type1','Type1','Type1','Type1','Type2','Type2','Type2','Type2','Type2','Type3','Type3','Type3','Type3'],"Super_Admin":["A1","A2","A3","A4","A1", "B1","B2","B3", "A4","C1","B2","C2","C3"], "Admin":["A1","A2",'', '',"A1","B1","B2",'', '', "B2", "C1","C2", '']})
Index | Privileges | Super_Admin | Admin |
---|---|---|---|
0 | Type1 | A1 | A1 |
1 | Type1 | A2 | A2 |
2 | Type1 | A3 | -- |
3 | Type1 | A4 | -- |
4 | Type2 | A1 | A1 |
5 | Type2 | B1 | B1 |
6 | Type2 | B2 | B2 |
7 | Type2 | B3 | -- |
8 | Type2 | A4 | -- |
9 | Type3 | C1 | B2 |
10 | Type3 | B2 | C1 |
11 | Type3 | C2 | C2 |
12 | Type3 | C3 | -- |