2

I have this table with strings delimiter "+"

ID  Products                
1   A + B + C + D + E ... 
2   A + F + G
3   X + D 

I would like to return in this format

ID  Products                 Product 1  Product 2  Product 3 Product 4 Product 5 product... 
1   A + B + C + D + E ...      A           B          C          D          E       ...
2   A + F + G                  A           F          G
3   X + D                      X           D
1   D + C + C + D + E          D           C          C          D          E      

How I can reproduce this in Pandas Dataframe?

Simon GIS
  • 1,045
  • 2
  • 18
  • 37

1 Answers1

3

Use Series.str.split with regex '\s+\+\s+' - it means one or more whitesapces, escaped +, one or more whitespaces, then change columns names by DataFrame.add_prefix and last add to original by DataFrame.join:

df1 = df['Products'].str.split('\s+\+\s+', expand=True).add_prefix('Product').fillna('')
df = df.join(df1)
print (df)
   ID           Products Product0 Product1 Product2 Product3 Product4
0   1  A + B + C + D + E        A        B        C        D        E
1   2          A + F + G        A        F        G                  
2   3              X + D        X        D                       

Also if necessary change column names:

d = lambda x: f'Product{x+1}'
df = (df.join(df['Products'].str.split('\s+\+\s+', expand=True)
                            .rename(columns=d)
                            .fillna('')))
print (df)
   ID           Products Product1 Product2 Product3 Product4 Product5
0   1  A + B + C + D + E        A        B        C        D        E
1   2          A + F + G        A        F        G                  
2   3              X + D        X        D                               
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252