I have a dataframe in which one column holds strings of comma-separated values. I want to flatten it so that each row holds a single value in that column, with the values in the other columns duplicated for each new row.
import pandas as pd
df = pd.DataFrame({'a':['1,2','4','3,5'], 'b':['a','b','c'], 's':[.1,.2,.3]})
Which gives a dataframe like so:
a b s
'1,2' 'a' .1
'4' 'b' .2
'3,5' 'c' .3
I want to turn this into a dataframe that looks like:
df = pd.DataFrame({'a':['1','2','4','3','5'], 'b':['a','a','b','c','c'], 's':[.1,.1,.2,.3,.3]})
which would display as:
a b s
'1' 'a' .1
'2' 'a' .1
'4' 'b' .2
'3' 'c' .3
'5' 'c' .3
I started by splitting the string column:
df = df.join(df['a'].str.split(',', n=1, expand=True))
which appends the split string column into new columns on the end, but I'm at a loss to finish the task. Any help is appreciated!
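For reference, here is a minimal sketch of one way the target frame above might be produced. It assumes pandas 0.25 or later, where `DataFrame.explode` is available; the name `flat` is only illustrative and not part of the original attempt.

```python
import pandas as pd

df = pd.DataFrame({'a': ['1,2', '4', '3,5'], 'b': ['a', 'b', 'c'], 's': [.1, .2, .3]})

# Split each comma-separated string into a list, then give every list
# element its own row. explode() repeats the other columns' values for
# each new row; reset_index() replaces the repeated index labels.
flat = (
    df.assign(a=df['a'].str.split(','))
      .explode('a')
      .reset_index(drop=True)
)

print(flat)
```

Unlike the `expand=True` approach, this keeps the split values in column `a` rather than adding new columns, so it works regardless of how many values each string contains.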