Fairly new to Python and Pandas here.
I am trying to combine the top n rows of each group, ranked by the values in a separate column, into a single row with Pandas.
Using a hypothetical example, let's say I have the following table, which is already sorted in descending order by the 'amount' column within each store_id:
store_id | item | amount |
---|---|---|
00001 | shirt | 5 |
00001 | sock | 3 |
00001 | pants | 1 |
00002 | sock | 4 |
00002 | pants | 2 |
00002 | shirt | 1 |
I would like to generate a table that is grouped by store_id, with each row containing a list of the top n items based on the value of the 'amount' column. So if I wanted to see the top 2 items per store_id, the table would look like this:
store_id | item |
---|---|
00001 | ['shirt', 'sock'] |
00002 | ['sock', 'pants'] |
I tried following the suggestion in "How to combine multiple rows into a single row with pandas", but I keep running into a "'GroupedData' object is not subscriptable" error.
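For reference, here is a minimal sketch of the setup and the result I am after, assuming the data lives in a plain pandas DataFrame. The names `df`, `n`, and `top_items` are placeholders for illustration, and the approach (sort, take the first n rows per group, then collect the items into a list) is just one way I think this could be done, not necessarily the best one:

```python
import pandas as pd

# Hypothetical sample data matching the table above.
# store_id is kept as a string so the leading zeros survive.
df = pd.DataFrame({
    "store_id": ["00001", "00001", "00001", "00002", "00002", "00002"],
    "item": ["shirt", "sock", "pants", "sock", "pants", "shirt"],
    "amount": [5, 3, 1, 4, 2, 1],
})

n = 2  # number of top items to keep per store

top_items = (
    # Sort explicitly so the result does not depend on the input order.
    df.sort_values(["store_id", "amount"], ascending=[True, False])
      # Keep the first n rows of each store (the n largest amounts).
      .groupby("store_id")
      .head(n)
      # Collapse the remaining rows of each store into one list of items.
      .groupby("store_id")["item"]
      .agg(list)
      .reset_index()
)

print(top_items)
```

One thing I noticed while putting this together: as far as I can tell, `GroupedData` is the class that PySpark returns from `groupBy`, whereas pandas returns a `DataFrameGroupBy` object. So it may be that my real DataFrame is a Spark DataFrame rather than a pandas one, which would explain why the pandas-style `[...]` indexing fails.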
Would greatly appreciate any suggestions on how to solve this. Thank you in advance.