
I have a DataFrame with an item column and a non-unique id column. First, I group by id:

grouped = df.groupby('id')

Now I can iterate over the groups like so:

for name, group in grouped:
    ...  # work with each group here

I can also get an array of all unique items with:

all_items = df['item'].unique()

For each group, I'd like to get a list/vector of length len(all_items) in which each entry is the number of times the corresponding item appears in that group. My main goal is to stack these vectors into a NumPy matrix (one row per id) so I can process it with scikit-learn models.

How can I do that?
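
To make the target concrete, here is a minimal sketch of the desired output on made-up data. Only the column names id and item come from the question; the toy values are hypothetical, and pd.crosstab is just one possible way to build the counts, not necessarily the best one.

```python
import pandas as pd

# Hypothetical toy data; only the column names 'id' and 'item' come from the question.
df = pd.DataFrame({
    "id":   [1, 1, 1, 2, 2, 3],
    "item": ["a", "b", "a", "b", "c", "a"],
})

all_items = df["item"].unique()

# Count how often each item occurs per id: one row per id, one column per item.
counts = pd.crosstab(df["id"], df["item"])

# Order the columns like all_items so every row vector has len(all_items) entries.
counts = counts.reindex(columns=all_items, fill_value=0)

# Plain NumPy matrix of shape (number of ids, len(all_items)).
X = counts.to_numpy()
print(X)
```

Row i of X is the count vector for the i-th id (in sorted id order), and column j corresponds to all_items[j].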

desertnaut
IsaacLevon
