I have a dataframe containing sales data for real estate parcels. I am trying to group by parcel number and then, for each parcel, see the most recent and second most recent sale dates along with the corresponding sale price for each of those two dates.
df =
parcel date amount
101469 5/29/2015 0:00 513000
101469 4/25/2017 0:00 570000
101470 1/6/1995 0:00 75000
101470 8/15/1995 0:00 385000
101470 12/31/2001 0:00 417500
df_grouped = df.groupby("parcel").agg({'date': lambda grp: [grp.nlargest(1).iloc[-1], grp.nlargest(2).iloc[-1]]})
The current code properly groups the data by parcel and also determines the most recent and second most recent sale dates. However, I am unable to add in the corresponding sales price for each.
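For reference, here is a minimal sketch of one way the prices might be carried along with the dates. It assumes the `date` column parses with the format `%m/%d/%Y %H:%M` and that the column names match the sample above. The idea is to sort newest first, number each sale within its parcel using `groupby().cumcount()`, keep the two most recent, and pivot them into columns. The `rank` column and the output names (`date_1`, `amount_1`, and so on) are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "parcel": [101469, 101469, 101470, 101470, 101470],
    "date": ["5/29/2015 0:00", "4/25/2017 0:00", "1/6/1995 0:00",
             "8/15/1995 0:00", "12/31/2001 0:00"],
    "amount": [513000, 570000, 75000, 385000, 417500],
})

# Assumption: dates are month/day/year strings; parse them so they sort correctly
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y %H:%M")

# Sort newest first, then number the sales within each parcel (1 = most recent)
ranked = df.sort_values("date", ascending=False)
ranked["rank"] = ranked.groupby("parcel").cumcount() + 1

# Keep only the two most recent sales per parcel
top2 = ranked[ranked["rank"] <= 2]

# Spread date and amount into one row per parcel
wide = top2.pivot(index="parcel", columns="rank", values=["date", "amount"])

# Flatten the (value, rank) column pairs into names like "date_1", "amount_2"
wide.columns = [f"{col}_{rank}" for col, rank in wide.columns]
df_grouped = wide.reset_index()
```

Because the date and the amount come from the same row, they stay paired. A parcel with only one sale would get missing values in the `_2` columns.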
Here is roughly the expected result I'd like to see. One row per parcel, showing the most recent sale date, the second most recent sale date, the most recent sale amount, and the second most recent sale amount: