My initial DataFrame looks as follows:
   A    B  quantity
0  1  foo         1
1  1  baz         2
2  1  bar         2
3  1  faz         1
4  2  foo         2
5  2  bar         1
6  3  foo         3
I need to group it by 'A' and, for each group, build a list in which every 'B' value is repeated 'quantity' times:
   A                               B
0  1  [foo, baz, baz, bar, bar, faz]
1  2                 [foo, foo, bar]
2  3                 [foo, foo, foo]
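To make this reproducible, here is a minimal sketch that rebuilds the input shown above and the result I am after. The name `expected` is only an illustrative label, and I am assuming the default integer index.

```python
import pandas as pd

# Input frame, copied from the table above.
df = pd.DataFrame({
    "A": [1, 1, 1, 1, 2, 2, 3],
    "B": ["foo", "baz", "bar", "faz", "foo", "bar", "foo"],
    "quantity": [1, 2, 2, 1, 2, 1, 3],
})

# Desired result: one row per 'A', with each 'B' value repeated 'quantity' times.
# `expected` is a hypothetical name used only for this illustration.
expected = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [
        ["foo", "baz", "baz", "bar", "bar", "faz"],
        ["foo", "foo", "bar"],
        ["foo", "foo", "foo"],
    ],
})
```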
Currently I'm using groupby() and then apply():
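import pandas as pd

def itemsToList(tdf, column):
    # Repeat each value in `column` as many times as its row's 'quantity'.
    collist = []
    for idx, value in tdf[column].items():  # items() replaces the removed iteritems()
        collist += [value] * int(tdf['quantity'][idx])
    return pd.Series({column: collist})

gb = df.groupby('A').apply(itemsToList, 'B')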
I doubt this is efficient, since it loops over every row in Python, so I'm looking for a more idiomatic ("pandaic") way to achieve this.
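One direction that might avoid the Python-level loop is to repeat the rows first and aggregate afterwards. The sketch below is an assumption on my part, not a benchmarked solution: it assumes the index is unique and that 'quantity' holds non-negative integers. The names `expanded` and `result` are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 1, 1, 1, 2, 2, 3],
    "B": ["foo", "baz", "bar", "faz", "foo", "bar", "foo"],
    "quantity": [1, 2, 2, 1, 2, 1, 3],
})

# Repeat each index label 'quantity' times, then select those rows,
# so every row appears as often as its quantity says.
expanded = df.loc[df.index.repeat(df["quantity"])]

# Collect the repeated 'B' values into one list per 'A' group.
result = expanded.groupby("A")["B"].agg(list).reset_index()
print(result)
```

This relies on `Index.repeat` and `groupby(...).agg(list)`; row order within each group is preserved, so the lists come out in the original order.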