I have a DataFrame that looks like this:

id      value     country
215     x, y      UK
360     z         Spain

I'd like to split it into this form:

id      value     country
215     x         UK
215     y         UK
360     z         Spain

So I want to duplicate every row where df['value'] holds more than one comma-separated value, producing one row per value.

I know I have to split it into a list:

df['value'] = df['value'].apply(lambda x: x.split(','))

What should I do next to duplicate the rows the way I want?

1 Answer

This should work. It uses str.split on the 'value' Series to turn each string into a list, then DataFrame.explode to give each list element its own row:

import pandas as pd

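df = pd.DataFrame({'ID': [215, 360], 'value': ['x, y', 'z'], 'country': ["UK", "Spain"]})
# Split on a comma plus any whitespace after it, so 'x, y' becomes ['x', 'y'] rather than ['x', ' y']
df["value"] = df["value"].str.split(pat=r",\s*")
# explode() emits one row per list element and repeats the other columns
print(df.explode("value"))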

Result:

    ID value country
0  215     x      UK
0  215     y      UK
1  360     z   Spain
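
Two details are worth tidying in practice. explode keeps the original index, which is why 0 appears twice in the result above, and splitting on a bare "," leaves any space after the comma attached to the next value. The following is a minimal sketch of one way to handle both, assuming pandas 1.1 or newer (for the ignore_index argument); the name out is only illustrative:

```python
import pandas as pd

df = pd.DataFrame({"ID": [215, 360], "value": ["x, y", "z"], "country": ["UK", "Spain"]})

# Split each string into a list, then emit one row per list element.
# ignore_index=True renumbers the rows 0..n-1 instead of repeating the old labels.
out = df.assign(value=df["value"].str.split(",")).explode("value", ignore_index=True)

# Remove stray whitespace left around each piece (e.g. " y" -> "y").
out["value"] = out["value"].str.strip()

print(out)
```

With this variant the rows are labelled 0, 1, 2 and the original df is left unchanged, because assign returns a new DataFrame.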
JarroVGIT