Is it possible to groupby-aggregate a column based on another column?

Question

We count stock over different fruits/vegetables on arbitrary days and save the data into this dataframe:

| product   | stock     | date          |
|---------  |-------    |------------   |
| carrots   | 92        | 2020-05-01    |
| carrots   | 285       | 2020-05-02    |
| apples    | 694       | 2020-05-01    |
| apples    | 586       | 2020-05-02    |
| oranges   | 58        | 2020-05-01    |
| oranges   | 45        | 2020-05-04    |

Using pandas I'd like to know my most recent stock for all types of produce:

| product   | stock     | date          |
|---------  |-------    |------------   |
| carrots   | 285       | 2020-05-02    |
| apples    | 586       | 2020-05-02    |
| oranges   | 45        | 2020-05-04    |

Heck, if there's a faster way to do it without the date, that'd be even finer!

| product   | stock     |
|---------  |-------    |
| carrots   | 285       |
| apples    | 586       |
| oranges   | 45        |

My no good for-loop (pseudo)-code takes minutes to run:

new_df = {}
for prod in df.product.unique():
  maxdate = df[df.product == prod].date.max()
  recentstock = df[(df.product == prod) & (df.date == maxdate)]
  new_df[prod] = recentstock

new_df = pd.DataFrame(new_df)

Is there a nicer pandas-style solution using aggfuncs, groupby or something else to do this?

`df.groupby("product").tail(1)`? – Henry Yik Oct 16 '20 at 08:46 — Henry Yik, Oct 16 '20 at 08:46

score 0 · Accepted Answer · answered Oct 16 '20 at 08:52

0

You can sort values by product and date and then take the last sample in each group:

df.sort_values(['product', 'date']).groupby('product').tail(1)

answered Oct 16 '20 at 08:52

Mykola Zotko

15,583
3
71
73

Is it possible to groupby-aggregate a column based on another column?

1 Answers1