
I'm trying to solve a Python pandas problem where certain cells contain semicolon-separated values. I would like to write a script that automatically splits those values into separate rows.

Example:

import pandas as pd
df = pd.DataFrame({'A': [7, 9, 3], 'B': ['a', 'b', 'c'], 'C': [6, '2;6', 4]})  # current DataFrame
df_new = pd.DataFrame({'A': [7, 9, 9, 3], 'B': ['a', 'b', 'b', 'c'], 'C': [6, 2, 6, 4]})  # DataFrame to be created

Is there an easy method to do this?

Thank you in advance for any help

Beertje

1 Answer

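Use the str.split() method to turn each string into a list. It returns NaN for the non-string cells (the integers 6 and 4), so use fillna() to restore their original values: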

df['C']=df['C'].str.split(';').fillna(df['C'])

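Finally, use the explode() method to give each list element its own row (the ignore_index argument requires pandas 1.1 or later):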

df=df.explode('C',ignore_index=True)

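If you now print df, you will get your desired output. Note that the split values '2' and '6' are still strings, so convert the column with astype(int) if you need integers: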

    A   B   C
0   7   a   6
1   9   b   2
2   9   b   6
3   3   c   4
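
For reference, here is a minimal end-to-end sketch of the same idea. It is a variation on the steps above, not the exact code from them: it assumes every value in column C is an integer or a semicolon-separated string of integers, casts the column to str up front so that fillna() is not needed, and converts back to int at the end.

```python
import pandas as pd

df = pd.DataFrame({'A': [7, 9, 3], 'B': ['a', 'b', 'c'], 'C': [6, '2;6', 4]})

# Cast every cell to str first so .str.split works on the integer cells too
df['C'] = df['C'].astype(str).str.split(';')

# One row per list element; ignore_index requires pandas >= 1.1
df_new = df.explode('C', ignore_index=True)

# The exploded values are strings, so convert them back to integers
# (assumption: every piece is a valid integer literal)
df_new['C'] = df_new['C'].astype(int)

print(df_new)
```

If column C can contain missing or non-numeric values, pd.to_numeric(df_new['C'], errors='coerce') may be a safer final step than astype(int).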
Anurag Dabas