I have a DataFrame with a column whose values are string representations of lists:
import pandas as pd
df = pd.DataFrame([
    {"a": "a1", "b": "['b11','b12','b13']"},
    {"a": "a2", "b": "['b21','b22','b23']"},
])
which looks like this:
    a                    b
0  a1  ['b11','b12','b13']
1  a2  ['b21','b22','b23']
How can I unfold it so that each list element gets its own row, like this:
    a    b
0  a1  b11
1  a1  b12
2  a1  b13
3  a2  b21
4  a2  b22
5  a2  b23
My first guess was:
from functools import reduce
vls = df.apply(lambda x: [{'a': x['a'], 'b': b} for b in list(eval(x['b']))], axis=1).values
df = pd.DataFrame(reduce(lambda x, y: x + y, vls))
It works, but it is very slow even on a small sample (~1,000 rows) of my data, and I need to apply it to millions of rows.
Is there a better way to do this using only the pandas API?
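For reference, one direction that might fit is sketched below. It assumes pandas 0.25 or newer (where `DataFrame.explode` is available) and that every value in `b` is a valid Python list literal. The names `parsed` and `unfolded` are only illustrative, and `ast.literal_eval` is swapped in for `eval` because it only accepts literals. I have not benchmarked this against the `reduce` version.

```python
import ast

import pandas as pd

df = pd.DataFrame([
    {"a": "a1", "b": "['b11','b12','b13']"},
    {"a": "a2", "b": "['b21','b22','b23']"},
])

# Assumption: every value in "b" is a string holding a valid list literal.
# ast.literal_eval turns each string into a real Python list.
parsed = df.assign(b=df["b"].map(ast.literal_eval))

# explode produces one row per list element and repeats the other columns;
# reset_index(drop=True) replaces the repeated index labels with 0..n-1.
unfolded = parsed.explode("b").reset_index(drop=True)

print(unfolded)
```

Note that the parsing step still runs Python code once per row, so only the unfolding itself is handled inside pandas.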