1

I have a large dataframe containing changelog of the data. Each line represents 'product' and each product can have multiple 'action types'. It can be new, deleted, renamed, moved, updated, etc. or it can be combination of them separated by comma.

Examples:

  • Move, Rename
  • Move, Rename, Update

What I am trying to achieve is, to replicate row for those ones, that have such combination for action type and create separate row for each of them. Here is example of what I have:

dataframe:

Product Description Effective date Action Type
Phone   Nokia       2019-08-08     Move, Text
Car     Honda       2018-12-12     Move, Text, Update
PC      Lenovo      2020-04-04     New

And what I want to achieve:

Product Description Effective date Action Type
Phone   Nokia       2019-08-08     Move
Phone   Nokia       2019-08-08     Text
Car     Honda       2018-12-12     Move
Car     Honda       2018-12-12     Text
Car     Honda       2018-12-12     Move, Text, Update
PC      Lenovo      2020-04-04     New

How this can be done?

ThePyGuy
  • 17,779
  • 5
  • 18
  • 45
SunWuKung
  • 31
  • 3

1 Answers1

3

You can use explode for this after you split your column by commas:

df['Action Type'] = df['Action Type'].str.split(', ')
df.explode('Action Type')

  Product Description Effective date Action Type
0   Phone       Nokia     2019-08-08        Move
0   Phone       Nokia     2019-08-08        Text
1     Car       Honda     2018-12-12        Move
1     Car       Honda     2018-12-12        Text
1     Car       Honda     2018-12-12      Update
2      PC      Lenovo     2020-04-04         New
sacuL
  • 49,704
  • 8
  • 81
  • 106