Suppose I want to analyze item-purchase records.
My analysis function expects user_id and item_ids:
def analyze(user_id, item_ids):
    ...
Is it a good idea to prepare the data as
user_id, item_ids
1, [3,4,5]
vs
user_id, item_ids
1, 3
1, 4
1, 5
(with the 2nd one, I could do groupby and generate the data format I need)
I just find the format ([1, [3,4,5]]) harder to work with than ([1,3],[1,4],[1,5]) in intermediate steps.
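
A minimal sketch of the grouping step I have in mind for the second format, using only the standard library. The helper name group_items and the body of analyze are placeholders made up for illustration, since the real analysis is not shown here. With pandas, something like df.groupby("user_id")["item_id"].apply(list) would presumably play the same role, assuming the rows live in a DataFrame with those column names.

```python
from collections import defaultdict

# Long format: one (user_id, item_id) pair per purchase (sample data from above).
rows = [(1, 3), (1, 4), (1, 5)]


def group_items(pairs):
    """Collect item ids per user, keeping the order in which they appear."""
    grouped = defaultdict(list)
    for user_id, item_id in pairs:
        grouped[user_id].append(item_id)
    return dict(grouped)


def analyze(user_id, item_ids):
    # Hypothetical placeholder body: just counts the items per user.
    return user_id, len(item_ids)


# Convert to the nested format only at the point where analyze() needs it.
grouped = group_items(rows)
results = [analyze(user_id, item_ids) for user_id, item_ids in grouped.items()]
```

This way the intermediate steps can stay in the flat one-row-per-purchase format, and the nested lists only exist right before the call to analyze.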