I need to find the number of days between a request date and its most recent offer date for each apartment number. My example dataframe looks like the first 3 columns below and I'm trying to figure out how to calculate the 'days_since_offer' column. The apartment and or_date columns are already sorted.
apartment offer_req_type or_date days_since_offer
A request 12/4/2019 n/a
A request 12/30/2019 n/a
A offer 3/4/2020 0
A request 4/2/2020 29
A request 6/4/2020 92
A request 8/4/2020 153
A offer 12/4/2020 0
A request 1/1/2021 28
B offer 1/1/2019 0
B request 8/1/2019 212
B offer 10/1/2019 0
B request 1/1/2020 92
B request 9/1/2020 244
B offer 1/1/2021 0
B request 1/25/2021 24
I tried to create a new function which sort of gives me what I want if I pass it dates for a single apartment. When I use the apply function is gives me an error though: "SpecificationError: Function names must be unique if there is no new column names assigned".
def func(attr, date_ser):
offer_dt = date(1900,1,1)
lapse_days = []
for row in range(len(attr)):
if attr[row] == 'offer':
offer_dt = date_ser[row]
lapse_days.append(-1)
else:
lapse_days.append(date_ser[row]-offer_dt)
print(lapse_days)
return lapse_days
df['days_since_offer'] = df.apply(func(df['offer_req_type'], df['or_date']))
I also tried to use groupby + diff functions like this and this but it's not the answer that I need:
df.groupby('offer_req_type').or_date.diff().dt.days
I also looked into using the shift method, but I'm not necessarily looking at sequential rows every time.
Any pointers on why my function is failing or if there is a better way to get the date differences that I need using a groupby method would be helpful!