I have a dataset that contains 1048576 rows.
The dataset has order_id and status columns.

I want to group by order_id and select the order_ids whose status is 5.

order = order.groupby(['order_id']).apply(lambda x: x.where(x.status == 5))

Example:

order_id  status
id1       5
id1       3
id1       4
id1       6
id2       5
id2       3
id2       4
id2       6

It is taking too long to execute, and I want to optimize the code above. Any suggestions would be appreciated.
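For illustration, here is a minimal sketch of two vectorized alternatives that avoid groupby().apply() with a Python lambda, which is usually the slow part on about a million rows. It assumes order is a pandas DataFrame shaped like the example data. Since the goal can be read two ways, both are shown; the names rows_with_5, ids_with_5 and orders_with_5 are hypothetical and not part of the original code.

```python
import pandas as pd

# Sample data mirroring the example in the question (assumed shape).
order = pd.DataFrame({
    "order_id": ["id1"] * 4 + ["id2"] * 4,
    "status": [5, 3, 4, 6, 5, 3, 4, 6],
})

# Reading A: keep only the rows whose status is 5.
# A boolean mask is evaluated once over the whole column, with no per-group calls.
rows_with_5 = order[order["status"] == 5]

# Reading B: keep every row of any order_id that has at least one status-5 row.
ids_with_5 = order.loc[order["status"] == 5, "order_id"].unique()
orders_with_5 = order[order["order_id"].isin(ids_with_5)]
```

Unlike where(), which keeps the original shape and fills non-matching rows with NaN, both readings drop the non-matching rows. If the NaN-filled shape is actually wanted, order.where(order["status"] == 5) without the groupby should give it directly.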

Asad Khalil
