Split the entries in dataframe

Question

Here is my sample data:

import pandas as pd

df = pd.DataFrame({'Name': ['A,B', 'C', 'D', 'E,F,G'],
                   'Age': [4, 6, 8, 9]})

My expected output splits each entry that contains more than one name into separate rows:

pd.DataFrame({'Name': ['A', 'B', 'C', 'D', 'E', 'F', 'G'],
              'Age': [4, 4, 6, 8, 9, 9, 9]})

I can split the names, but I don't know how to duplicate the entries. The first row has A,B under Name, so I want to turn it into two separate rows, A and B, both aged 4. Similarly, E,F,G are all aged 9, so I want to convert that row into three rows, each with age 9. This is what I have so far:

df['Name'].apply(lambda x : x.split(','))

`The output data is self-explaining:` Why does 4 get duplicated twice, 6 and 8 duplicated once, and 9 gets duplicated three times? — Nick ODell, Aug 18 '23 at 17:47
@NickODell They just want to `explode`. Maybe you misread it because the code style was really dense. I edited to improve it though they just reverted it by accident. — wjandrea, Aug 18 '23 at 17:51

score 1 · Accepted Answer · answered Aug 18 '23 at 17:50

Combine df.set_index + str.split + explode:

df.set_index('Age')['Name'].str.split(',').explode().reset_index()

   Age Name
0    4    A
1    4    B
2    6    C
3    8    D
4    9    E
5    9    F
6    9    G
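
For anyone who wants to run this end to end, here is a minimal sketch using the sample data from the question. The variable name `result` is only illustrative. One caveat: this chain carries along only the column that was moved into the index, so a frame with further columns would need all of them passed to `set_index`.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['A,B', 'C', 'D', 'E,F,G'],
                   'Age': [4, 6, 8, 9]})

# Move Age into the index so it travels with Name through the split.
result = (
    df.set_index('Age')['Name']
      .str.split(',')   # each Name becomes a list of strings
      .explode()        # one row per list element, index value repeated
      .reset_index()    # turn Age back into a regular column
)
print(result)
```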

RomanPerekhrest


Why set and reset the index? You can just `.explode('Name')`. – wjandrea Aug 18 '23 at 17:52
@wjandrea, you will get nothing with `df.explode('Name')` – RomanPerekhrest Aug 18 '23 at 17:53
You'd just need to do the split differently, e.g. `df.assign(Name=df['Name'].str.split(',')).explode('Name')` – wjandrea Aug 18 '23 at 17:54
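
To make the exchange above concrete, here is a rough sketch on the question's sample data. Calling `explode` on the raw comma-separated strings leaves the frame unchanged, because `explode` only expands list-like values. Once the column holds lists, it works without touching the index. The names `unchanged` and `result` and the final `reset_index(drop=True)` are additions for illustration, not part of the original comment.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['A,B', 'C', 'D', 'E,F,G'],
                   'Age': [4, 6, 8, 9]})

# Plain strings are scalars to explode(), so nothing is expanded here.
unchanged = df.explode('Name')

# Split first so each cell holds a list, then explode that column.
result = (
    df.assign(Name=df['Name'].str.split(','))
      .explode('Name')
      .reset_index(drop=True)  # replace the repeated labels 0, 0, 1, ... with 0..6
)
print(result)
```

This variant keeps the original column order and any other columns the frame may have.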

score 0 · Answer 2 · edited Aug 18 '23 at 17:59

I've found a solution:

df1['New Name'] = df1['Name'].str.split(',')

df1 = df1.explode("New Name")

df1 = df1.drop(columns='Name')
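
A self-contained sketch of this approach, assuming `df1` holds the sample data from the question. Note that `drop` returns a new frame rather than modifying `df1` in place, so the result is assigned back here. The final `rename` is an optional extra to restore the original column name.

```python
import pandas as pd

# Assumption: df1 is the question's sample frame.
df1 = pd.DataFrame({'Name': ['A,B', 'C', 'D', 'E,F,G'],
                    'Age': [4, 6, 8, 9]})

df1['New Name'] = df1['Name'].str.split(',')  # lists of names in a new column
df1 = df1.explode('New Name')                 # one row per name
df1 = df1.drop(columns='Name')                # drop() is not in place by default
df1 = df1.rename(columns={'New Name': 'Name'}).reset_index(drop=True)
print(df1)
```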

edited Aug 18 '23 at 17:59 by ouroboros1

answered Aug 18 '23 at 17:54

Derek

