0

I have an excel file with 200 rows, 2 of which have comma separated values in them. If I output them to tab-separated, it would look like this:

col1  col2
a     b,c
d     e,f,g

I need to explode to get a dataframe like this, exploding 200 rows into ~4,000:

col1  col2
a     b
a     c
d     e
d     f
d     g

I don't see any explode functionality in pandas and haven't been able to figure out how to do this having the columns of comma-separated values uneven in length - not sure how split would work here.

Help me stack-overflow, you're my only hope. Thanks!

Super_John
  • 1,767
  • 2
  • 14
  • 27
  • I found this, but I do not have my data is a list format - not sure if that's a helpful thread for anyone https://stackoverflow.com/questions/32468402/how-to-explode-a-list-inside-a-dataframe-cell-into-separate-rows – Super_John Jun 05 '17 at 03:18

1 Answers1

2

Let's use pd.DataFrame, .str.split, stack:

df_out = (pd.DataFrame(df.col2.str.split(',').tolist(), index=df.col1)
      .stack()
      .reset_index()
      .drop('level_1',axis=1)
      .rename(columns={0:'col2'}))

Output:

  col1 col2
0    a    b
1    a    c
2    d    e
3    d    f
4    d    g
Scott Boston
  • 147,308
  • 15
  • 139
  • 187