1

I am working on a below problem :

df_temp = pd.DataFrame() 
df_temp.insert(0, 'Label', ["A|B|C","A|C","C|B","A","B"])  
df_temp.insert(1, 'ID', [1,2,3,4,5])  
df_temp 

    Label   ID
0   A|B|C   1
1   A|C     2
2   C|B     3
3   A       4
4   B       5 

I want to convert this dataframe into something like below dataframe, where I can separate Labels for ID column.

Expected Output :

   ID  Label
    1  A
    1  B
    1  C
    2  A
    2  C
    3  C
    3  B
    4  A
    5  B
KMittal
  • 602
  • 1
  • 7
  • 21

1 Answers1

2

Try this:

(df_temp.set_index('ID')['Label']
        .str.split('|', expand=True)
        .reset_index()
        .melt('ID')
        .drop('variable', axis=1)
        .dropna()
        .sort_values('ID'))

Output:

    ID value
0    1     A
5    1     B
10   1     C
1    2     A
6    2     C
2    3     C
7    3     B
3    4     A
4    5     B
cs95
  • 379,657
  • 97
  • 704
  • 746
Scott Boston
  • 147,308
  • 15
  • 139
  • 187