How to "unfold" rows according to a column in Pandas

Question

I have this dataframe, in which column b holds string representations of lists:

import pandas as pd
df = pd.DataFrame([
    {"a":"a1", "b":"['b11','b12','b13']"},
    {"a":"a2", "b":"['b21','b22','b23']"}
])

which is just:

    a                    b
0  a1  ['b11','b12','b13']
1  a2  ['b21','b22','b23']

How can I unfold it so that each element of the list in b gets its own row, with the value of a repeated?

My first guess was:

from functools import reduce
vls = df.apply(lambda x: [{'a': x['a'], 'b': b} for b in list(eval(x['b']))], axis=1).values
df = pd.DataFrame(reduce(lambda x, y: x + y, vls))

It works, but it takes a very long time even for a small subset (~1000 rows) of my data, and I need to apply it to millions of rows.

I wonder if there is a better way that uses only the pandas API.

score 1 · Accepted Answer · answered Oct 17 '18 at 17:39

Try this:

df.groupby('a').apply(
    lambda g: pd.DataFrame({
        'a': [g.a.iloc[0]] * len(eval(g.b.iloc[0])),
        'b': eval(g.b.iloc[0]),
    })
)

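Instead of reduce, this uses groupby to expand each row into its own small dataframe. It assumes the values in column a are unique: only the first row of each group is read, so any further rows sharing the same a would be dropped.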
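
On pandas 0.25 or later, which postdates this answer, DataFrame.explode may be a simpler pandas-only route. The following is a minimal sketch, assuming every entry in b is a valid Python list literal; it swaps eval for ast.literal_eval, and the names parsed and unfolded are only illustrative.

```python
from ast import literal_eval

import pandas as pd

df = pd.DataFrame([
    {"a": "a1", "b": "['b11','b12','b13']"},
    {"a": "a2", "b": "['b21','b22','b23']"},
])

# Parse each string into a real Python list. literal_eval only accepts
# literals, so it is a safer stand-in for eval (assumption: every entry
# in b is a well-formed list literal).
parsed = df.assign(b=df["b"].map(literal_eval))

# explode emits one row per list element and repeats the other columns;
# reset_index replaces the repeated original labels with 0..n-1.
unfolded = parsed.explode("b").reset_index(drop=True)
print(unfolded)
```

Because explode works row by row, this version does not depend on column a being unique.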

answered Oct 17 '18 at 17:39

Rocky Li

much faster! thanks! – Thiago Melo Oct 17 '18 at 17:46
