We count stock over different fruits/vegetables on arbitrary days and save the data into this dataframe:
| product | stock | date |
|--------- |------- |------------ |
| carrots | 92 | 2020-05-01 |
| carrots | 285 | 2020-05-02 |
| apples | 694 | 2020-05-01 |
| apples | 586 | 2020-05-02 |
| oranges | 58 | 2020-05-01 |
| oranges | 45 | 2020-05-04 |
Using pandas I'd like to know my most recent stock for all types of produce:
| product | stock | date |
|--------- |------- |------------ |
| carrots | 285 | 2020-05-02 |
| apples | 586 | 2020-05-02 |
| oranges | 45 | 2020-05-04 |
Heck, if there's a faster way to do it without the date, that'd be even finer!
| product | stock |
|--------- |------- |
| carrots | 285 |
| apples | 586 |
| oranges | 45 |
My no good for-loop (pseudo)-code takes minutes to run:
new_df = {}
for prod in df.product.unique():
maxdate = df[df.product == prod].date.max()
recentstock = df[(df.product == prod) & (df.date == maxdate)]
new_df[prod] = recentstock
new_df = pd.DataFrame(new_df)
Is there a nicer pandas-style solution using aggfuncs, groupby or something else to do this?