Here is my sample data:

import pandas as pd

df = pd.DataFrame({'Name': ['A,B', 'C', 'D', 'E,F,G'],
                   'Age': [4, 6, 8, 9]})

My expected output splits every entry that holds more than one name into separate rows:

pd.DataFrame({'Name': ['A', 'B', 'C', 'D', 'E', 'F', 'G'],
              'Age': [4, 4, 6, 8, 9, 9, 9]})

I can split the names, but I don't know how to turn them into duplicated rows. The first row has A,B under Name, so I want it to become two separate rows, A and B, both aged 4. Similarly, E, F and G are all aged 9, so I want that row to become three rows with the same age of 9.

df['Name'].apply(lambda x : x.split(','))

Derek
  • `The output data is self-explaining:` Why does 4 get duplicated twice, 6 and 8 duplicated once, and 9 gets duplicated three times? – Nick ODell Aug 18 '23 at 17:47
  • BTW, this is simpler: `df['Name'].str.split(',')` – wjandrea Aug 18 '23 at 17:49
  • @NickODell They just want to `explode`. Maybe you misread it because the code style was really dense. I edited to improve it though they just reverted it by accident. – wjandrea Aug 18 '23 at 17:51

2 Answers

Combine df.set_index + str.split + explode:

df.set_index('Age')['Name'].str.split(',').explode().reset_index()

   Age Name
0    4    A
1    4    B
2    6    C
3    8    D
4    9    E
5    9    F
6    9    G
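
For reference, here is a self-contained sketch of the same chain, assuming the sample frame from the question. The variable name `result` is only illustrative. Note that `set_index('Age')` carries just that one column along, so a frame with more columns would need them all in the index.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['A,B', 'C', 'D', 'E,F,G'],
                   'Age': [4, 6, 8, 9]})

# Move Age into the index so it is repeated for every exploded name.
result = (
    df.set_index('Age')['Name']
      .str.split(',')    # 'A,B' -> ['A', 'B']
      .explode()         # one row per list element
      .reset_index()     # bring Age back as a regular column
)
print(result)
```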
RomanPerekhrest

I've found a solution:

df['New Name'] = df['Name'].str.split(',')

df = df.explode('New Name')

df.drop(columns='Name')
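
A possible variant of the same idea, sketched against the sample frame from the question: overwrite `Name` in place instead of adding a helper column, and let `explode` renumber the rows. It assumes pandas 1.1 or later for the `ignore_index` parameter, and the name `result` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['A,B', 'C', 'D', 'E,F,G'],
                   'Age': [4, 6, 8, 9]})

result = (
    # Replace each comma-separated string with a list of names.
    df.assign(Name=df['Name'].str.split(','))
      # One row per name; every other column (here Age) is repeated.
      .explode('Name', ignore_index=True)
)
print(result)
```

This keeps the original column order (`Name`, then `Age`) and works unchanged when the frame has more columns.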
ouroboros1
Derek