1

Let's say I have something like this:

df = pd.DataFrame({'key':[1,2,3], 'type':[[1,3],[1,2,3],[1,2]], 'value':[5,1,8]})

key type        value
1   [1, 3]      5
2   [1, 2, 3]   1
3   [1]         8

Where one of the columns contains a list of items. I would like to create several rows for each row that contains multiple types.

Ontaining this:

key type  value
1   1     5
1   3     5
2   1     1
2   2     1
2   3     1
3   1     8

I've been playing with apply with axis=1 but I can't find a way to return more than 1 row per row of the DataFrame. Extracting all different 'types' and then looping-concatenating seems to be ugly.

any ideas? Thanks!!!

user3635284
  • 513
  • 2
  • 8
  • 16
  • Loop concatenate is the way to go. If you are reluctant of doing so because it 'seems to be ugly', the real ugly thing here is to build a dataframe with list in columns in the first place. Avoid building such a column upfront is the right way to go and you problem will disappear nicely. – Zeugma Nov 13 '16 at 01:31
  • I inherited the DataFrame :) I'll try the loop-concat approach. – user3635284 Nov 13 '16 at 02:15
  • 1
    https://stackoverflow.com/questions/53218931/how-to-unnest-explode-a-column-in-a-pandas-dataframe-into-multiple-rows – cardamom May 19 '22 at 22:29

1 Answers1

1
import itertools
import pandas as pd
import numpy as np

def melt_series(s):
    lengths = s.str.len().values
    flat = [i for i in itertools.chain.from_iterable(s.values.tolist())]
    idx = np.repeat(s.index.values, lengths)
    return pd.Series(flat, idx, name=s.name)


melt_series(df.type).to_frame().join(df.drop('type', 1)).reindex_axis(df.columns, 1)

enter image description here


setup

df = pd.DataFrame({'key':[1,2,3],
                   'type':[[1,3],[1,2,3],[1,2]],
                   'value':[5,1,8]})
df

enter image description here

piRSquared
  • 285,575
  • 57
  • 475
  • 624