I have a dataframe like this:
Col1 col2 col3
test0 [1,2,3] [ab,bc,cd]
The output dataframe I want is:
Col1 col2 col3
test0 1 ab
test0 2 bc
test0 3 cd
There may be several columns like col2, each holding lists of the same length.
You can do:
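import pandas as pd

# Turn each list column into one value per row, keeping the original row index
outputdf_expandedcols = pd.DataFrame({
    "col2": df.apply(lambda x: pd.Series(x['col2']), axis=1).stack().reset_index(level=1, drop=True),
    "col3": df.apply(lambda x: pd.Series(x['col3']), axis=1).stack().reset_index(level=1, drop=True)
})
# Join Col1 back on the shared index so it repeats for every expanded row
outputdf = df[['Col1']].join(outputdf_expandedcols, how='right')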
outputdf
will be:
Col1 col2 col3
0 test0 1 ab
0 test0 2 bc
0 test0 3 cd
If you have more columns to expand you can use a dict comprehension:
list_of_cols_to_expand = ["col2", "col3"]  # the column names you want to expand
outputdf_expandedcols = pd.DataFrame({
    col: df.apply(lambda x: pd.Series(x[col]), axis=1).stack().reset_index(level=1, drop=True)
    for col in list_of_cols_to_expand
})
outputdf = df[['Col1']].join(outputdf_expandedcols, how='right')
The output is the same as above.
This answer is based on this thread.
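For pandas versions that have Series.explode (0.25 and later), a shorter variant may be possible. The following is a minimal sketch, not the method from the linked thread; the second row (test1) is made-up sample data, and it assumes the lists within each row have the same length across the expanded columns:

```python
import pandas as pd

# Hypothetical sample data; the second row is added for illustration only.
df = pd.DataFrame({
    "Col1": ["test0", "test1"],
    "col2": [[1, 2, 3], [4, 5]],
    "col3": [["ab", "bc", "cd"], ["de", "ef"]],
})

cols_to_expand = ["col2", "col3"]

# Move the key column into the index, explode each list column
# independently with Series.explode, then restore Col1 as a column.
outputdf = (
    df.set_index("Col1")[cols_to_expand]
      .apply(pd.Series.explode)
      .reset_index()
)
print(outputdf)
```

Because every column is exploded separately, rows whose lists differ in length between columns would not line up, so this only suits the equal-length case described in the question.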
If you have pandas 1.3.0 or later, where DataFrame.explode accepts a list of columns, you can also do:
cols_to_expand = ["col2", "col3"] # or more columns if you have more
outputdf = df.explode(cols_to_expand)
outputdf
will be:
Col1 col2 col3
0 test0 1 ab
0 test0 2 bc
0 test0 3 cd
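Note that every row in that output keeps the index label 0 of the row it came from. If you would rather have a fresh 0, 1, 2 index, something like the sketch below should work; the sample frame is rebuilt from the question's data, and the variable names exploded and renumbered are only illustrative:

```python
import pandas as pd

# Sample frame rebuilt from the question's data.
df = pd.DataFrame({
    "Col1": ["test0"],
    "col2": [[1, 2, 3]],
    "col3": [["ab", "bc", "cd"]],
})

# Passing a list of columns to explode needs pandas 1.3.0 or later.
exploded = df.explode(["col2", "col3"])

# Each exploded row keeps the index label of its source row (0 here);
# reset_index(drop=True) replaces those labels with 0, 1, 2.
renumbered = exploded.reset_index(drop=True)
print(renumbered)
```

Passing ignore_index=True to explode is another way to get the same renumbering in one call.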
To get a compatible pandas version in Google Colab, run the following cell (based on this):
%%shell
pip install --upgrade --force-reinstall pandas
pip install -I pandas
pip install --ignore-installed pandas
Then restart the kernel (click Runtime, then Restart runtime).
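After the restart, it may be worth confirming that the new version was actually picked up. A minimal sketch, assuming the relevant threshold is pandas 1.3 (the release that added list support to DataFrame.explode); the helper name supports_multi_column_explode is hypothetical, not part of pandas:

```python
import re

import pandas as pd


def supports_multi_column_explode(version: str) -> bool:
    """Hypothetical helper: True if the version string is pandas 1.3 or later."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (1, 3)


# Print the installed version and whether it passes the check.
print(pd.__version__, supports_multi_column_explode(pd.__version__))
```

If this prints False, the old version is probably still loaded and the runtime needs another restart.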