I have a dataframe in which some column values are delimited by '|', and I need to flatten it. Example:
name type
a l
b m
c|d|e n
I want to flatten this df to:
name type
a l
b m
c n
d n
e n
To do this, I used this command:
df = df.assign(name=df.name.str.split('|')).explode('name').drop_duplicates()
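For reference, here is a minimal reproducible version of that flatten step. It assumes the column being exploded is `name` and rebuilds the example frame by hand; the `reset_index` call is only there to tidy the repeated index that `explode` leaves behind.

```python
import pandas as pd

# Example frame from the question
df = pd.DataFrame({"name": ["a", "b", "c|d|e"], "type": ["l", "m", "n"]})

# Split each name on the literal '|' and give every piece its own row
flat = (
    df.assign(name=df["name"].str.split("|"))
      .explode("name")
      .drop_duplicates()
      .reset_index(drop=True)
)
print(flat)
```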
Now, I want to do one more thing in addition to the flatten operation above:
name type co_occur
a l
b m
c n d
c n e
d n e
That is, I want to not only split 'c|d|e' into separate rows, but also create a new 'co_occur' column that records the co-occurrence relationship: 'c', 'd', and 'e' co-occur with each other, so each unordered pair gets its own row.
I don't see an easy way to do this by modifying:
df = df.assign(name=df.name.str.split('|')).explode('name').drop_duplicates()
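One direction that might work is to drop `explode` for this step and build the rows directly with `itertools.combinations`. This is only a rough sketch: `expand_pairs` is a hypothetical helper name, and it assumes that names without a partner should get an empty string in `co_occur` and that each unordered pair should appear once.

```python
from itertools import combinations

import pandas as pd

# Example frame from the question
df = pd.DataFrame({"name": ["a", "b", "c|d|e"], "type": ["l", "m", "n"]})

def expand_pairs(row):
    """Hypothetical helper: turn one source row into (name, type, co_occur) tuples."""
    # Split on '|' and drop repeated names while keeping their order
    names = list(dict.fromkeys(row["name"].split("|")))
    if len(names) == 1:
        # Assumption: a name with no partner gets an empty co_occur value
        return [(names[0], row["type"], "")]
    # One output row per unordered pair of co-occurring names
    return [(a, row["type"], b) for a, b in combinations(names, 2)]

rows = [rec for _, row in df.iterrows() for rec in expand_pairs(row)]
out = (
    pd.DataFrame(rows, columns=["name", "type", "co_occur"])
      .drop_duplicates()
      .reset_index(drop=True)
)
print(out)
```

With the example frame this produces the five rows listed above. `iterrows` is slow on large frames, so this is meant to pin down the expected result rather than to be the fastest approach.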