I have a dataframe like this:

Col1    col2     col3
test0   [1,2,3]  [ab,bc,cd]

The output dataframe I want is:

col1   col2  col3
test0  1      ab
test0  2      bc
test0  3      cd

There may be multiple columns like col2, each holding lists of the same length.

Dcook
  • take a look at [explode](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.explode.html) – Tranbi Nov 07 '21 at 10:06

2 Answers

You can do:

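import pandas as pd

# Turn each list into one value per row; the original row index is repeated.
outputdf_expandedcols = pd.DataFrame({
    "col2": df.apply(lambda x: pd.Series(x['col2']), axis=1).stack().reset_index(level=1, drop=True),
    "col3": df.apply(lambda x: pd.Series(x['col3']), axis=1).stack().reset_index(level=1, drop=True)
})

# Join on the repeated index so every expanded row gets its Col1 value.
outputdf = df[['Col1']].join(outputdf_expandedcols, how='right')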

outputdf will be:

    Col1  col2 col3
0  test0     1   ab
0  test0     2   bc
0  test0     3   cd

If you have more columns to expand, you can use a dict comprehension:

list_of_cols_to_expand = ["col2", "col3"]  # put the column names you want to expand here
outputdf_expandedcols = pd.DataFrame({
    col: df.apply(lambda x: pd.Series(x[col]), axis=1).stack().reset_index(level=1, drop=True)
    for col in list_of_cols_to_expand
})

outputdf = df[['Col1']].join(outputdf_expandedcols, how='right')

Output same as above.
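
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The sample frame is rebuilt from the question, with a second row added purely for illustration, and `expand_list_columns` is a hypothetical helper name, not a pandas function. It assumes every list has the same length, as in the question.

```python
import pandas as pd

# Sample data rebuilt from the question; the "test1" row is an added illustration.
df = pd.DataFrame({
    "Col1": ["test0", "test1"],
    "col2": [[1, 2, 3], [4, 5, 6]],
    "col3": [["ab", "bc", "cd"], ["de", "ef", "fg"]],
})


def expand_list_columns(frame, key_col, list_cols):
    """Hypothetical helper: return one row per list element.

    Assumes all lists in list_cols have the same length.
    """
    expanded = pd.DataFrame({
        col: frame.apply(lambda row: pd.Series(row[col]), axis=1)
                  .stack()
                  .reset_index(level=1, drop=True)
        for col in list_cols
    })
    # Join on the repeated row index, then replace it with a fresh 0..n-1 index.
    return frame[[key_col]].join(expanded, how="right").reset_index(drop=True)


outputdf = expand_list_columns(df, "Col1", ["col2", "col3"])
print(outputdf)
```

The final `reset_index(drop=True)` is optional; without it the index keeps the repeated original row labels shown in the output above.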

This answer is based on this thread.

zabop

If you have pandas 1.3.0 or later, you can also pass a list of columns to explode:

cols_to_expand = ["col2", "col3"] # or more columns if you have more
outputdf = df.explode(cols_to_expand)

outputdf will be:

    Col1 col2 col3
0  test0    1   ab
0  test0    2   bc
0  test0    3   cd
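
As a runnable sketch, assuming the sample frame from the question: the `ignore_index=True` argument and the `astype(int)` cast are optional additions, not part of the snippet above. They give a fresh 0..n-1 index and turn `col2` back into integers, since `explode` leaves the exploded columns with object dtype.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "Col1": ["test0"],
    "col2": [[1, 2, 3]],
    "col3": [["ab", "bc", "cd"]],
})

cols_to_expand = ["col2", "col3"]

# Passing a list of columns needs pandas 1.3.0+; the lists in each row must match in length.
outputdf = df.explode(cols_to_expand, ignore_index=True)

# Exploded columns come back as object dtype, so cast the numeric one explicitly.
outputdf["col2"] = outputdf["col2"].astype(int)

print(outputdf)
```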

To have a compatible Pandas version in Google Colab, you need to run a cell (based on this):

%%shell
pip install --upgrade --force-reinstall pandas
pip install -I pandas
pip install --ignore-installed pandas

Then restart the kernel (click Runtime, then Restart runtime).
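
After the restart, it may be worth confirming which version actually got loaded. A small sketch, where `supports_multi_column_explode` is a hypothetical helper name and the 1.3 threshold reflects the release in which `explode` started accepting a list of columns:

```python
import pandas as pd


def supports_multi_column_explode(version: str) -> bool:
    """Hypothetical helper: True if the version string is 1.3 or later."""
    # Only the major and minor parts are compared, e.g. "1.3.0rc1" -> (1, 3).
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= (1, 3)


print(pd.__version__, supports_multi_column_explode(pd.__version__))
```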

zabop