I think melt (as discussed here) may potentially be useful for this, but I can't quite figure out how to use it to solve my problem.
I'm starting with a complex dictionary like this:
order = [
{
"order_id" : 0,
"lines" : [
{
"line_id" : 1,
"line_amount" : 3.45,
"line_description" : "first line"
},
{
"line_id" : 2,
"line_amount" : 6.66,
"line_description" : "second line"
},
{
"line_id" : 3,
"line_amount" : 5.43,
"line_description" : "third line"
},
]
},
{
"order_id" : 1,
"lines" : [
...
}
]
I want a DataFrame with one row per order line (not one row per order) that still includes the original order's attributes (which in this example is just the order_id). Currently the most efficient way to achieve this I've come up with is:
# Orders DataFrame
odf = pandas.DataFrame(orders)
line_dfs = []
for oid, line_list in odf.iterrows():
line_df = pandas.DataFrame(line_list).copy()
line_df["order_id"] = oid
line_dfs += [ line_df ]
# Line DataFrame
ldf = pandas.concat(line_dfs, sort=False, ignore_index=True).copy()
Is there a more efficient, "vectorized" way to .apply something to achieve this?
ldf = odf.lines.apply(...?...)
Thanks for any help, including just a link to an answer on SO or elsewhere that already addresses this and that I just haven't found yet.